ARA-C01 Sample Questions Answers

Questions 4

A retail company has over 3000 stores all using the same Point of Sale (POS) system. The company wants to deliver near real-time sales results to category managers. The stores operate in a variety of time zones and exhibit a dynamic range of transactions each minute, with some stores having higher sales volumes than others.

Sales results are provided in a uniform fashion using data engineered fields that will be calculated in a complex data pipeline. Calculations include exceptions, aggregations, and scoring using external functions interfaced to scoring algorithms. The source data for aggregations has over 100M rows.

Every minute, the POS sends all sales transactions files to a cloud storage location with a naming convention that includes store numbers and timestamps to identify the set of transactions contained in the files. The files are typically less than 10MB in size.

How can the near real-time results be provided to the category managers? (Select TWO).

Options:

A. All files should be concatenated before ingestion into Snowflake to avoid micro-ingestion.

B. A Snowpipe should be created and configured with AUTO_INGEST = true. A stream should be created to process INSERTS into a single target table using the stream metadata to inform the store number and timestamps.

C. A stream should be created to accumulate the near real-time data and a task should be created that runs at a frequency that matches the real-time analytics needs.

D. An external scheduler should examine the contents of the cloud storage location and issue SnowSQL commands to process the data at a frequency that matches the real-time analytics needs.

E. The COPY INTO command with a task scheduled to run every second should be used to achieve the near real-time requirement.

Answer:

B, C

Explanation:

To provide near real-time sales results to category managers, the Architect can use the following steps:

Create an external stage that references the cloud storage location where the POS sends the sales transaction files. The external stage should use the file format and encryption settings that match the source files. [2]

Create a Snowpipe that loads the files from the external stage into a target table in Snowflake. The Snowpipe should be configured with AUTO_INGEST = true, so that event notifications from the cloud storage service trigger loading of new files as they arrive. Snowpipe keeps load metadata for each pipe and does not reload files it has already ingested; the PURGE copy option is not supported in pipe definitions, so staged files must be removed by a separate process if cleanup is needed. [3]

Create a stream on the target table that captures the INSERTS made by the Snowpipe. The stream exposes change-tracking metadata columns (METADATA$ACTION, METADATA$ISUPDATE, and METADATA$ROW_ID); file-level details such as the file name are not part of the stream and are instead captured at load time, for example by selecting METADATA$FILENAME in the pipe's COPY statement. The stream must be consumed before it becomes stale, which depends on the table's data retention settings. [4]

Create a task that runs a query on the stream to process the near real-time data. The query should derive the store number and timestamps from the file name recorded at load time, and perform the calculations for exceptions, aggregations, and scoring using external functions. The query should write the results to another table that the category managers can access, directly or through a view. The task should be scheduled to run at a frequency that matches the real-time analytics needs, such as every minute or every 5 minutes.
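
The four steps above could be sketched in Snowflake SQL roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: every object name (pos_int, pos_stage, pos_sales_raw, sales_results, pos_pipe, pos_stream, pos_task, transform_wh), the bucket URL, and the two-column CSV layout are hypothetical, and the simple aggregation stands in for the real exceptions and scoring logic.

```sql
-- All names, the URL, and the CSV layout below are assumptions for illustration.
CREATE STAGE pos_stage
  URL = 's3://example-bucket/pos/'
  STORAGE_INTEGRATION = pos_int            -- assumed to exist already
  FILE_FORMAT = (TYPE = CSV);

CREATE TABLE pos_sales_raw (
  txn_id      STRING,
  amount      NUMBER(12,2),
  source_file STRING                       -- file name recorded at load time
);

CREATE TABLE sales_results (
  source_file  STRING,
  total_amount NUMBER(18,2),
  processed_at TIMESTAMP_LTZ
);

-- Snowpipe: loads each new file when the cloud storage notification arrives.
CREATE PIPE pos_pipe AUTO_INGEST = TRUE AS
  COPY INTO pos_sales_raw (txn_id, amount, source_file)
  FROM (SELECT t.$1, t.$2, METADATA$FILENAME FROM @pos_stage t);

-- Stream: tracks rows inserted by the pipe since the last consumption.
CREATE STREAM pos_stream ON TABLE pos_sales_raw APPEND_ONLY = TRUE;

-- Task: runs every minute, but only when the stream has new rows.
CREATE TASK pos_task
  WAREHOUSE = transform_wh
  SCHEDULE = '1 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('POS_STREAM')
AS
  INSERT INTO sales_results (source_file, total_amount, processed_at)
  SELECT source_file, SUM(amount), CURRENT_TIMESTAMP()
  FROM pos_stream
  GROUP BY source_file;

ALTER TASK pos_task RESUME;                -- tasks are created suspended
```

Because the INSERT reads the stream inside a DML statement, the stream offset advances when the task's transaction commits, so each loaded row is processed once. The store number and timestamp would be parsed from source_file according to the actual file naming convention, which the question does not spell out.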

The other options are not optimal or feasible for providing near real-time results:

All files should be concatenated before ingestion into Snowflake to avoid micro-ingestion. This option is not recommended because it would introduce additional latency and complexity in the data pipeline. Concatenating files would require an external process or service that monitors the cloud storage location and performs the file merging operation. This would delay the ingestion of new files into Snowflake and increase the risk of data loss or corruption. Moreover, concatenating files would not avoid micro-ingestion, as Snowpipe would still ingest each concatenated file as a separate load.

An external scheduler should examine the contents of the cloud storage location and issue SnowSQL commands to process the data at a frequency that matches the real-time analytics needs. This option is not necessary because Snowpipe can automatically ingest new files from the external stage without requiring an external trigger or scheduler. Using an external scheduler would add more overhead and dependency to the data pipeline, and it would not guarantee near real-time ingestion, as it would depend on the polling interval and the availability of the external scheduler.

The copy into command with a task scheduled to run every second should be used to achieve the near-real time requirement. This option is not feasible because a task cannot be scheduled to run every second in Snowflake. Task schedules have a minimum interval (one minute when this question was written), and even that interval is not guaranteed, because task runs are subject to scheduling delays and do not overlap by default. Moreover, using the COPY INTO command with a task would not take advantage of Snowpipe's automatic file detection through event notifications or its serverless compute.

References:

1: SnowPro Advanced: Architect | Study Guide

2: Snowflake Documentation | Creating Stages

3: Snowflake Documentation | Loading Data Using Snowpipe

4: Snowflake Documentation | Using Streams and Tasks for ELT

Snowflake Documentation | Creating Tasks

Snowflake Documentation | Best Practices for Loading Data

Snowflake Documentation | Using the Snowpipe REST API

Snowflake Documentation | Scheduling Tasks

Questions 5

A table for IOT devices that measures water usage is created. The table quickly becomes large and contains more than 2 billion rows.

The general query patterns for the table are:

1. DeviceId, IOT_timestamp and CustomerId are frequently used in the filter predicate for the select statement

2. The columns City and DeviceManufacturer are often retrieved

3. There is often a count on UniqueId

Which field(s) should be used for the clustering key?

Options:

IOT_timestamp

City and DeviceManufacturer

DeviceId and CustomerId

UniqueId

Questions 6

What Snowflake features should be leveraged when modeling using Data Vault?

Options:

Snowflake’s support of multi-table inserts into the data model’s Data Vault tables

Data needs to be pre-partitioned to obtain a superior data access performance

Scaling up the virtual warehouses will support parallel processing of new source loads

Snowflake’s ability to hash keys so that hash key joins can run faster than integer joins

Questions 7

When using the COPY INTO <table> command with the CSV file format, how does the MATCH_BY_COLUMN_NAME parameter behave?

Options:

A. It expects a header to be present in the CSV file, which is matched to a case-sensitive table column name.

B. The parameter will be ignored.

C. The command will return an error.

D. The command will return a warning stating that the file has unmatched columns.

Answer:

C

Explanation:

The MATCH_BY_COLUMN_NAME parameter in the COPY INTO <table> command loads semi-structured or structured data, such as CSV, into columns of the target table by matching column names in the data file with those in the table. For CSV files, this requires a header row in the file, which is used to map columns to the target table.

According to the official Snowflake documentation, when the MATCH_BY_COLUMN_NAME parameter is used with CSV files, it is only supported in specific scenarios and requires the PARSE_HEADER file format option to be set to TRUE. This option indicates that the first row of the CSV file contains column headers, which Snowflake uses to match with the target table's column names. The matching behavior is configured as CASE_SENSITIVE or CASE_INSENSITIVE; the default value, NONE, disables name matching.

However, there is a critical limitation when using MATCH_BY_COLUMN_NAME with CSV files: according to the documentation quoted below, this feature is in preview for CSV files and is not generally available for all accounts. When the MATCH_BY_COLUMN_NAME parameter is specified for a CSV file in an environment where this feature is not enabled, or if the PARSE_HEADER option is not set to TRUE, the COPY INTO command will return an error. This is because Snowflake cannot match column names without parsing the header row.

The exact extract from the Snowflake documentation states:

"For loading CSV files, the MATCH_BY_COLUMN_NAME copy option is available in preview. It requires the use of the above-mentioned CSV file format option PARSE_HEADER = TRUE."

Additionally, the documentation clarifies:

"Boolean that specifies whether to use the first row headers in the data files to determine column names. This file format option is applied to the following actions only: Automatically detecting column definitions by using the INFER_SCHEMA function. Loading CSV data into separate columns by using the INFER_SCHEMA function and MATCH_BY_COLUMN_NAME copy option."

Furthermore, a known issue is noted:

"For CSV only, there is a known issue when the INCLUDE_METADATA copy option is used with MATCH_BY_COLUMN_NAME. Do not use this copy option when loading CSV files until the known issue is resolved."

Given that the MATCH_BY_COLUMN_NAME parameter is not fully supported for CSV files in general availability and requires specific preview conditions, attempting to use it without meeting those conditions, such as PARSE_HEADER = TRUE or enabling the preview feature, results in an error. Therefore, option C is correct: The command will return an error.

Option A is incorrect because, while MATCH_BY_COLUMN_NAME expects a header in the CSV file for matching when the feature is enabled, the case-sensitive matching is only true when explicitly set to CASE_SENSITIVE. Additionally, the feature's limited availability means it is not guaranteed to work without causing an error. Option B is incorrect because the parameter is not simply ignored; it triggers an error if the conditions are not met. Option D is incorrect because Snowflake does not issue a warning for unmatched columns in this context; it fails with an error when the parameter is unsupported or misconfigured.
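
For accounts where header-based matching for CSV is available, the configuration described above would look roughly like the following. This is only a sketch: the table name sales, the stage my_stage, and the file name are hypothetical.

```sql
-- Hypothetical table, stage, and file; the options themselves are the ones
-- named in the explanation above.
COPY INTO sales
FROM @my_stage/sales_2024.csv
FILE_FORMAT = (TYPE = CSV PARSE_HEADER = TRUE)   -- first row supplies column names
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;         -- match header names to table columns
```

Without PARSE_HEADER = TRUE in the file format, the explanation above expects the statement to fail with an error rather than load any data.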

References:

Snowflake Documentation: COPY INTO <table>

Snowflake Documentation: Transforming Data During a Load

Stack Overflow: COPY INTO Snowflake Table with Extra Columns

Questions 8

A company has a source system that provides JSON records for various IoT operations. The JSON is loading directly into a persistent table with a variant field. The data is quickly growing to 100s of millions of records and performance is becoming an issue. There is a generic access pattern that is used to filter on the create_date key within the variant field.

What can be done to improve performance?

Options:

A. Alter the target table to include additional fields pulled from the JSON records. This would include a create_date field with a datatype of timestamp. When this field is used in the filter, partition pruning will occur.

B. Alter the target table to include additional fields pulled from the JSON records. This would include a create_date field with a datatype of varchar. When this field is used in the filter, partition pruning will occur.

C. Validate the size of the warehouse being used. If the record count is approaching 100s of millions, size XL will be the minimum size required to process this amount of data.

D. Incorporate the use of multiple tables partitioned by date ranges. When a user or process needs to query a particular date range, ensure the appropriate base table is used.

Answer:

A

Explanation:

The correct answer is A because it reduces the amount of data scanned and processed. With a dedicated create_date column of type timestamp, Snowflake records the range of values for that column in the metadata of each micro-partition and can prune the micro-partitions that cannot match the filter condition. This avoids traversing the variant field for every record to evaluate the filter.
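
As a rough illustration of option A, the statements below add a native timestamp column and populate it from the variant field. The table name iot_events and the variant column name payload are hypothetical, and TIMESTAMP_NTZ is an assumed choice of timestamp type.

```sql
-- Hypothetical table and column names; only create_date comes from the question.
ALTER TABLE iot_events ADD COLUMN create_date TIMESTAMP_NTZ;

-- Backfill the new column from the JSON key of the same name.
UPDATE iot_events
SET create_date = payload:create_date::TIMESTAMP_NTZ;

-- Filters on the native column can use micro-partition metadata for pruning.
SELECT COUNT(*)
FROM iot_events
WHERE create_date >= '2024-01-01'::TIMESTAMP_NTZ
  AND create_date <  '2024-02-01'::TIMESTAMP_NTZ;
```

For new data, the same cast would normally be applied while loading, so that the column is filled as rows arrive rather than by a later update.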

Option B is incorrect because a varchar column is a poor fit for date filtering. The values are compared as strings, so date-range filters either depend on the string format sorting correctly or require a cast on the column, and in both cases pruning is less effective than with a native timestamp column.

Option C is incorrect because it does not address the root cause of the performance issue. A larger warehouse adds compute resources and parallelism, but it does not reduce the amount of data scanned and processed, which is the main bottleneck for queries on JSON data. There is also no rule that a given record count requires a minimum warehouse size.

Option D is incorrect because it adds unnecessary complexity and overhead to the data loading and querying process. By incorporating the use of multiple tables partitioned by date ranges, Snowflake can reduce the amount of data scanned and processed for queries that specify a date range. However, this requires creating and maintaining multiple tables, loading data into the appropriate table based on the date, and joining the tables for queries that span multiple date ranges.

Questions 9

Data is being imported and stored as JSON in a VARIANT column. Query performance was fine, but most recently, poor query performance has been reported.

What could be causing this?

Options:

A. There were JSON nulls in the recent data imports.

B. The order of the keys in the JSON was changed.

C. The recent data imports contained fewer fields than usual.

D. There were variations in string lengths for the JSON values in the recent data imports.

Answer:

A

Explanation:

When JSON is loaded into a VARIANT column, Snowflake extracts as many elements as possible into an internal columnar form and stores the remainder as a single parsed structure. Queries on an extracted element scan only that element's column; queries on an element that was not extracted must scan the whole JSON structure and traverse it for each row, which is slower.

There were JSON nulls in the recent data imports. This is the likely cause. According to Snowflake's considerations for semi-structured data stored in VARIANT, an element that contains even a single JSON null value is not extracted into a column, so that the difference between JSON null and SQL NULL is preserved. Once recent imports introduce JSON nulls, queries on the affected elements lose the columnar optimization. If the nulls only indicate missing values, they can be removed at load time with the STRIP_NULL_VALUES file format option.

The other options are not documented causes of poor query performance:

The order of the keys in the JSON was changed. JSON objects are unordered, and key order is not among the documented conditions that prevent an element from being extracted.

There were variations in string lengths for the JSON values in the recent data imports. String length is not among the documented conditions either. Non-native values, such as dates and timestamps, are always stored as strings in a VARIANT column, regardless of their length.

The recent data imports contained fewer fields than usual. Missing elements, unlike elements with JSON null values, are still represented in columnar form, and Snowflake handles semi-structured data with varying fields.
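
The difference between a JSON null and a SQL NULL inside a VARIANT value, and one way to drop JSON nulls while loading, can be sketched as follows. The file format name json_strip_nulls is hypothetical; the functions and the STRIP_NULL_VALUES option are standard Snowflake features.

```sql
-- JSON null versus SQL NULL inside a VARIANT value.
SELECT
  PARSE_JSON('{"a": null}'):a IS NULL        AS json_null_is_sql_null,   -- FALSE
  IS_NULL_VALUE(PARSE_JSON('{"a": null}'):a) AS is_json_null,            -- TRUE
  PARSE_JSON('{"b": 1}'):a IS NULL           AS missing_key_is_sql_null; -- TRUE

-- Hypothetical file format that removes JSON null values during loading.
CREATE FILE FORMAT json_strip_nulls
  TYPE = JSON
  STRIP_NULL_VALUES = TRUE;
```

Stripping nulls is only appropriate when a JSON null carries no meaning beyond "value missing"; otherwise the distinction is lost.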

Considerations for Semi-structured Data Stored in VARIANT

Snowflake Architect Training

Snowflake query performance on unique element in variant column

Snowflake variant performance

Questions 10

An Architect has been asked to clone schema STAGING as it looked one week ago, Tuesday June 1st at 8:00 AM, to recover some objects.

The STAGING schema has 50 days of retention.

The Architect runs the following statement:

CREATE SCHEMA STAGING_CLONE CLONE STAGING at (timestamp => '2021-06-01 08:00:00');

The Architect receives the following error: Time travel data is not available for schema STAGING. The requested time is either beyond the allowed time travel period or before the object creation time.

The Architect then checks the schema history and sees the following:

CREATED_ON | NAME | DROPPED_ON

2021-06-02 23:00:00 | STAGING | NULL

2021-05-01 10:00:00 | STAGING | 2021-06-02 23:00:00

How can cloning the STAGING schema be achieved?

Options:

Undrop the STAGING schema and then rerun the CLONE statement.

Modify the statement: CREATE SCHEMA STAGING_CLONE CLONE STAGING at (timestamp => '2021-05-01 10:00:00');

Rename the STAGING schema and perform an UNDROP to retrieve the previous STAGING schema version, then run the CLONE statement.

Cloning cannot be accomplished because the STAGING schema version was not active during the proposed Time Travel time period.

Questions 11

Database DB1 has schema S1 which has one table, T1.

DB1 --> S1 --> T1

The retention period of DB1 is set to 10 days.

The retention period of S1 is set to 20 days.

The retention period of T1 is set to 30 days.

The user runs the following command:

Drop Database DB1;

What will the Time Travel retention period be for T1?

Options:

10 days

20 days

30 days

37 days

Questions 12

When loading data into a table that captures the load time in a column with a default value of either CURRENT_TIME() or CURRENT_TIMESTAMP(), what will occur?

Options:

All rows loaded using a specific COPY statement will have varying timestamps based on when the rows were inserted.

Any rows loaded using a specific COPY statement will have varying timestamps based on when the rows were read from the source.

Any rows loaded using a specific COPY statement will have varying timestamps based on when the rows were created in the source.

All rows loaded using a specific COPY statement will have the same timestamp value.

Questions 13

A company is designing a process for importing a large amount of IoT JSON data from cloud storage into Snowflake. New sets of IoT data get generated and uploaded approximately every 5 minutes.

Once the IoT data is in Snowflake, the company needs up-to-date information from an external vendor to join to the data. This data is then presented to users through a dashboard that shows different levels of aggregation. The external vendor is a Snowflake customer.

What solution will MINIMIZE complexity and MAXIMIZE performance?

Options:

1. Create an external table over the JSON data in cloud storage. 2. Create a task that runs every 5 minutes to run a transformation procedure on new data, based on a saved timestamp. 3. Ask the vendor to expose an API so an external function can be used to generate a call to join the data back to the IoT data in the transformation procedure. 4. Give the transformed table access to the dashboard tool. 5. Perform the aggregations on the dashboard

1. Create an external table over the JSON data in cloud storage. 2. Create a task that runs every 5 minutes to run a transformation procedure on new data based on a saved timestamp. 3. Ask the vendor to create a data share with the required data that can be imported into the company's Snowflake account. 4. Join the vendor's data back to the IoT data using a transformation procedure. 5. Create views over the larger dataset to perform the aggrega

1. Create a Snowpipe to bring the JSON data into Snowflake. 2. Use streams and tasks to trigger a transformation procedure when new JSON data arrives. 3. Ask the vendor to expose an API so an external function call can be made to join the vendor's data back to the IoT data in a transformation procedure. 4. Create materialized views over the larger dataset to perform the aggregations required by the dashboard. 5. Give the materialized views acce

1. Create a Snowpipe to bring the JSON data into Snowflake. 2. Use streams and tasks to trigger a transformation procedure when new JSON data arrives. 3. Ask the vendor to create a data share with the required data that is then imported into the Snowflake account. 4. Join the vendor's data back to the IoT data in a transformation procedure. 5. Create materialized views over the larger dataset to perform the aggregations required by the dashboard

Questions 14

Files arrive in an external stage every 10 seconds from a proprietary system. The files range in size from 500 K to 3 MB. The data must be accessible by dashboards as soon as it arrives.

How can a Snowflake Architect meet this requirement with the LEAST amount of coding? (Choose two.)

Options:

Use Snowpipe with auto-ingest.

Use a COPY command with a task.

Use a materialized view on an external table.

Use the COPY INTO command.

Use a combination of a task and a stream.

Questions 15

An Architect is designing a data lake with Snowflake. The company has structured, semi-structured, and unstructured data. The company wants to save the data inside the data lake within the Snowflake system. The company is planning on sharing data among its corporate branches using Snowflake data sharing.

What should be considered when sharing the unstructured data within Snowflake?

Options:

A pre-signed URL should be used to save the unstructured data into Snowflake in order to share data over secure views, with no time limit for the URL.

A scoped URL should be used to save the unstructured data into Snowflake in order to share data over secure views, with a 24-hour time limit for the URL.

A file URL should be used to save the unstructured data into Snowflake in order to share data over secure views, with a 7-day time limit for the URL.

A file URL should be used to save the unstructured data into Snowflake in order to share data over secure views, with the "expiration_time" argument defined for the URL time limit.

Questions 16

At which object type level can the APPLY MASKING POLICY, APPLY ROW ACCESS POLICY and APPLY SESSION POLICY privileges be granted?

Options:

Global

Database

Schema

Table

Questions 17

What considerations need to be taken when using database cloning as a tool for data lifecycle management in a development environment? (Select TWO).

Options:

Any pipes in the source are not cloned.

Any pipes in the source referring to internal stages are not cloned.

Any pipes in the source referring to external stages are not cloned.

The clone inherits all granted privileges of all child objects in the source object, including the database.

The clone inherits all granted privileges of all child objects in the source object, excluding the database.

Questions 18

What are characteristics of the use of transactions in Snowflake? (Select TWO).

Options:

Explicit transactions can contain DDL, DML, and query statements.

The autocommit setting can be changed inside a stored procedure.

A transaction can be started explicitly by executing a BEGIN WORK statement and ended explicitly by executing a COMMIT WORK statement.

A transaction can be started explicitly by executing a BEGIN TRANSACTION statement and ended explicitly by executing an END TRANSACTION statement.

Explicit transactions should contain only DML statements and query statements. All DDL statements implicitly commit active transactions.

Questions 19

Which columns can be included in an external table schema? (Select THREE).

Options:

VALUE

METADATA$ROW_ID

METADATA$ISUPDATE

METADATA$FILENAME

METADATA$FILE_ROW_NUMBER

METADATA$EXTERNAL_TABLE_PARTITION

Questions 20

The IT Security team has identified that there is an ongoing credential stuffing attack on many of their organization’s systems.

What is the BEST way to find recent and ongoing login attempts to Snowflake?

Options:

Call the LOGIN_HISTORY Information Schema table function.

Query the LOGIN_HISTORY view in the ACCOUNT_USAGE schema in the SNOWFLAKE database.

View the History tab in the Snowflake UI and set up a filter for SQL text that contains the text "LOGIN".

View the Users section in the Account tab in the Snowflake UI and review the last login column.

Questions 21

A new table and streams are created with the following commands:

CREATE OR REPLACE TABLE LETTERS (ID INT, LETTER STRING) ;

CREATE OR REPLACE STREAM STREAM_1 ON TABLE LETTERS;

CREATE OR REPLACE STREAM STREAM_2 ON TABLE LETTERS APPEND_ONLY = TRUE;

The following operations are processed on the newly created table:

INSERT INTO LETTERS VALUES (1, 'A');

INSERT INTO LETTERS VALUES (2, 'B');

INSERT INTO LETTERS VALUES (3, 'C');

TRUNCATE TABLE LETTERS;

INSERT INTO LETTERS VALUES (4, 'D');

INSERT INTO LETTERS VALUES (5, 'E');

INSERT INTO LETTERS VALUES (6, 'F');

DELETE FROM LETTERS WHERE ID = 6;

What would be the output of the following SQL commands, in order?

SELECT COUNT (*) FROM STREAM_1;

SELECT COUNT (*) FROM STREAM_2;

Options:

2 & 6

2 & 3

4 & 3

4 & 6

Questions 22

Which of the following are characteristics of how row access policies can be applied to external tables? (Choose three.)

Options:

A. An external table can be created with a row access policy, and the policy can be applied to the VALUE column.

B. A row access policy can be applied to the VALUE column of an existing external table.

C. A row access policy cannot be directly added to a virtual column of an external table.

D. External tables are supported as mapping tables in a row access policy.

E. While cloning a database, both the row access policy and the external table will be cloned.

F. A row access policy cannot be applied to a view created on top of an external table.

Answer:

A, B, C

Explanation:

Statements A, B, and C are true according to the Snowflake documentation. A row access policy filters the rows returned by a query based on user-defined conditions. It can be applied to an external table, which is a table that reads data from external files in a stage, but some limitations and considerations apply.

An external table can be created with a row access policy by using the WITH ROW ACCESS POLICY clause in the CREATE EXTERNAL TABLE statement. The policy can be applied to the VALUE column, which contains the raw data from the external files as a VARIANT. [1]

A row access policy can also be applied to the VALUE column of an existing external table by using an ALTER TABLE statement with the ADD ROW ACCESS POLICY clause. [2]

A row access policy cannot be directly added to a virtual column of an external table. A virtual column is a column that is derived from the VALUE column using an expression. To filter on such a column, create a view on the external table and apply the row access policy to the view's columns. [3]

External tables are not supported as mapping tables in a row access policy. A mapping table is a table that is used to determine the access rights of users or roles based on some criteria. [4]

While cloning a database, Snowflake clones the row access policy, but not the external table. Therefore, the policy in the cloned database refers to a table that is not present in the cloned database. To avoid this issue, the external table must be recreated in the cloned database. [4]

A row access policy can be applied to a view created on top of an external table, so the last option is false. The policy can be applied to the view itself or to the underlying external table. The view does not have to be a secure view, although a secure view additionally hides the view definition from unauthorized users. [5]
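
The two supported placements, on the VALUE column of the external table and on a view over it, might look like the following sketch. The external table sales_ext, the JSON keys region and amount, the view name, and both policy names are hypothetical.

```sql
-- Policy whose argument matches the VARIANT type of the VALUE column.
CREATE ROW ACCESS POLICY region_policy AS (val VARIANT) RETURNS BOOLEAN ->
  val:region::STRING = 'EMEA';

-- Attach the policy to the VALUE column of an existing external table.
ALTER TABLE sales_ext ADD ROW ACCESS POLICY region_policy ON (value);

-- Alternative: expose derived columns through a view and protect the view.
CREATE VIEW sales_ext_v AS
  SELECT value:region::STRING       AS region,
         value:amount::NUMBER(12,2) AS amount
  FROM sales_ext;

CREATE ROW ACCESS POLICY region_col_policy AS (region STRING) RETURNS BOOLEAN ->
  region = 'EMEA';

ALTER VIEW sales_ext_v ADD ROW ACCESS POLICY region_col_policy ON (region);
```

In practice only one of the two placements is usually needed; both are shown here to contrast the VALUE column with a derived column.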

1: CREATE EXTERNAL TABLE | Snowflake Documentation

2: ALTER EXTERNAL TABLE | Snowflake Documentation

3: Understanding Row Access Policies | Snowflake Documentation

4: Snowflake Data Governance: Row Access Policy Overview

5: Secure Views | Snowflake Documentation

Questions 23

A retailer's enterprise data organization is exploring the use of Data Vault 2.0 to model its data lake solution. A Snowflake Architect has been asked to provide recommendations for using Data Vault 2.0 on Snowflake.

What should the Architect tell the data organization? (Select TWO).

Options:

Change data capture can be performed using the Data Vault 2.0 HASH_DIFF concept.

Change data capture can be performed using the Data Vault 2.0 HASH_DELTA concept.

Using the multi-table insert feature in Snowflake, multiple Point-in-Time (PIT) tables can be loaded in parallel from a single join query from the data vault.

Using the multi-table insert feature, multiple Point-in-Time (PIT) tables can be loaded sequentially from a single join query from the data vault.

There are performance challenges when using Snowflake to load multiple Point-in-Time (PIT) tables in parallel from a single join query from the data vault.

Questions 24

The following table exists in the production database:

A regulatory requirement states that the company must mask the username for events that are older than six months based on the current date when the data is queried.

How can the requirement be met without duplicating the event data and making sure it is applied when creating views using the table or cloning the table?

Options:

A. Use a masking policy on the username column using an entitlement table with valid dates.

B. Use a row level policy on the user_events table using an entitlement table with valid dates.

C. Use a masking policy on the username column with event_timestamp as a conditional column.

D. Use a secure view on the user_events table using a case statement on the username column.

Answer:

C

Explanation:

A masking policy is a feature of Snowflake that allows masking sensitive data in query results based on the role of the user and the condition of the data. A masking policy can be applied to a column in a table or a view, and it can use another column in the same table or view as a conditional column. A conditional column is a column that determines whether the masking policy is applied or not based on its value1.

In this case, the requirement can be met by using a masking policy on the username column with event_timestamp as a conditional column. The masking policy can use a function that masks the username if the event_timestamp is older than six months based on the current date, and returns the original username otherwise. The masking policy can be applied to the user_events table, and it will also be applied when creating views using the table or cloning the table2.
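The conditional masking described above can be sketched roughly as follows. This is a minimal illustration, not the exam's reference solution: the policy name mask_old_usernames, the column data types, and the masked literal are assumptions, since the table definition is not reproduced here.

```sql
-- Assumed types: username is a string, event_timestamp is TIMESTAMP_NTZ.
-- The first argument is the masked column; the second is the conditional column.
CREATE OR REPLACE MASKING POLICY mask_old_usernames
  AS (username STRING, event_timestamp TIMESTAMP_NTZ) RETURNS STRING ->
  CASE
    -- Evaluated at query time, so the six-month window moves with the current date.
    WHEN event_timestamp < DATEADD(month, -6, CURRENT_DATE()) THEN '***MASKED***'
    ELSE username
  END;

-- Attach the policy to the column and name the conditional column in USING.
ALTER TABLE user_events MODIFY COLUMN username
  SET MASKING POLICY mask_old_usernames USING (username, event_timestamp);
```

Because the policy is attached to the table column rather than to a view, no event data is duplicated and the rule does not depend on how the table is queried.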

The other options are not correct because:

A. Using a masking policy on the username column using an entitlement table with valid dates would require creating another table that stores the valid dates for each username, and joining it with the user_events table in the masking policy function. This would add complexity and overhead to the masking policy, and it would not use the event_timestamp column as the condition for masking.

B. Using a row level policy on the user_events table using an entitlement table with valid dates would require creating another table that stores the valid dates for each username, and joining it with the user_events table in the row access policy function. This would filter out the rows that have event_timestamp older than six months based on the valid dates, instead of masking the username column. This would not meet the requirement of masking the username, and it would also reduce the visibility of the event data.

D. Using a secure view on the user_events table using a case statement on the username column would require creating a view that uses a case expression to mask the username column based on the event_timestamp column. This would meet the requirement of masking the username, but it would not be applied when cloning the table. A secure view is a view that prevents the underlying data from being exposed by queries on the view. However, a secure view does not prevent the underlying data from being exposed by cloning the table3.

1: Masking Policies | Snowflake Documentation

2: Using Conditional Columns in Masking Policies | Snowflake Documentation

3: Secure Views | Snowflake Documentation

Questions 25

What built-in Snowflake features make use of the change tracking metadata for a table? (Choose two.)

Options:

The MERGE command

The UPSERT command

The CHANGES clause

A STREAM object

The CHANGE_DATA_CAPTURE command

Questions 26

Which security, governance, and data protection features require, at a MINIMUM, the Business Critical edition of Snowflake? (Choose two.)

Options:

Extended Time Travel (up to 90 days)

Customer-managed encryption keys through Tri-Secret Secure

Periodic rekeying of encrypted data

AWS, Azure, or Google Cloud private connectivity to Snowflake

Federated authentication and SSO

Questions 27

Based on the architecture in the image, how can the data from DB1 be copied into TBL2? (Select TWO).

Options:

Option A

Option B

Option C

Option D

Option E

Answer:

B, E

Explanation:

The architecture in the image shows a Snowflake data platform with two databases, DB1 and DB2, and two schemas, SH1 and SH2. DB1 contains a table TBL1 and a stage STAGE1. DB2 contains a table TBL2. The image also shows a snippet of SQL code that copies data from STAGE1 to TBL2 using a file format FF_PIPE_1.

To copy data from DB1 to TBL2, there are two possible options among the choices given:

Option B: Use a named external stage that references STAGE1. This option requires creating an external stage object in DB2.SH2 that points to the same location as STAGE1 in DB1.SH1. The external stage can be created using the CREATE STAGE command, with the URL parameter set to the storage location that STAGE1 references1. For example:

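use database DB2;
use schema SH2;
-- URL must be the cloud storage location that STAGE1 points to (placeholder below);
-- a stage reference such as @DB1.SH1.STAGE1 is not a valid URL value.
create stage EXT_STAGE1
  url = '<cloud storage location of STAGE1>';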

Then, the data can be copied from the external stage to TBL2 using the COPY INTO command with the FROM parameter specifying the external stage name and the FILE_FORMAT parameter specifying the file format name2. For example:

copy into TBL2
from @EXT_STAGE1
file_format = (format_name = 'DB1.SH1.FF_PIPE_1');

Option E: Use a cross-database query to select data from TBL1 and insert into TBL2. This option requires using the INSERT INTO command with the SELECT clause to query data from TBL1 in DB1.SH1 and insert it into TBL2 in DB2.SH2. The query must use the fully-qualified names of the tables, including the database and schema names3. For example:

use database DB2;
use schema SH2;
insert into TBL2
select * from DB1.SH1.TBL1;

The other options are not valid because:

Option A: It uses an invalid syntax for the COPY INTO command. The FROM parameter cannot specify a table name, only a stage name or a file location2.

Option C: It uses an invalid syntax for the COPY INTO command. The FILE_FORMAT parameter cannot specify a stage name, only a file format name or options2.

Option D: It uses an invalid syntax for the CREATE STAGE command. The URL parameter cannot specify a table name, only a file location1.

1: CREATE STAGE | Snowflake Documentation

2: COPY INTO table | Snowflake Documentation

3: Cross-database Queries | Snowflake Documentation

Questions 28

An Architect is designing a file ingestion recovery solution. The project will use an internal named stage for file storage. Currently, in the case of an ingestion failure, the Operations team must manually download the failed file and check for errors.

Which downloading method should the Architect recommend that requires the LEAST amount of operational overhead?

Options:

Use the Snowflake Connector for Python, connect to remote storage and download the file.

Use the get command in SnowSQL to retrieve the file.

Use the get command in Snowsight to retrieve the file.

Use the Snowflake API endpoint and download the file.

Questions 29

A company has an external vendor who puts data into Google Cloud Storage. The company's Snowflake account is set up in Azure.

What would be the MOST efficient way to load data from the vendor into Snowflake?

Options:

Ask the vendor to create a Snowflake account, load the data into Snowflake and create a data share.

Create an external stage on Google Cloud Storage and use the external table to load the data into Snowflake.

Copy the data from Google Cloud Storage to Azure Blob storage using external tools and load data from Blob storage to Snowflake.

Create a Snowflake Account in the Google Cloud Platform (GCP), ingest data into this account and use data replication to move the data from GCP to Azure.

Questions 30

There are two databases in an account, named fin_db and hr_db which contain payroll and employee data, respectively. Accountants and Analysts in the company require different permissions on the objects in these databases to perform their jobs. Accountants need read-write access to fin_db but only require read-only access to hr_db because the database is maintained by human resources personnel.

An Architect needs to create a read-only role for certain employees working in the human resources department.

Which permission sets must be granted to this role?

Options:

USAGE on database hr_db, USAGE on all schemas in database hr_db, SELECT on all tables in database hr_db

USAGE on database hr_db, SELECT on all schemas in database hr_db, SELECT on all tables in database hr_db

MODIFY on database hr_db, USAGE on all schemas in database hr_db, USAGE on all tables in database hr_db

USAGE on database hr_db, USAGE on all schemas in database hr_db, REFERENCES on all tables in database hr_db

Questions 31

What are some of the characteristics of result set caches? (Choose three.)

Options:

Time Travel queries can be executed against the result set cache.

Snowflake persists the data results for 24 hours.

Each time persisted results for a query are used, a 24-hour retention period is reset.

The data stored in the result cache will contribute to storage costs.

The retention period can be reset for a maximum of 31 days.

The result set cache is not shared between warehouses.

Questions 32

A company is designing high availability and disaster recovery plans and needs to maximize redundancy and minimize recovery time objectives for their critical application processes. Cost is not a concern as long as the solution is the best available. The plan so far consists of the following steps:

1. Deployment of Snowflake accounts on two different cloud providers.

2. Selection of cloud provider regions that are geographically far apart.

3. The Snowflake deployment will replicate the databases and account data between both cloud provider accounts.

4. Implementation of Snowflake client redirect.

What is the MOST cost-effective way to provide the HIGHEST uptime and LEAST application disruption if there is a service event?

Options:

Connect the applications using the - URL. Use the Business Critical Snowflake edition.

Connect the applications using the - URL. Use the Virtual Private Snowflake (VPS) edition.

Connect the applications using the - URL. Use the Enterprise Snowflake edition.

Connect the applications using the - URL. Use the Business Critical Snowflake edition.

Questions 33

An Architect is using SnowCD to investigate a connectivity issue.

Which system function will provide a list of endpoints that the network must be able to access to use a specific Snowflake account, leveraging private connectivity?

Options:

SYSTEM$ALLOWLIST()

SYSTEM$GET_PRIVATELINK

SYSTEM$AUTHORIZE_PRIVATELINK

SYSTEM$ALLOWLIST_PRIVATELINK()

Questions 34

A table contains five columns and it has millions of records. The cardinality distribution of the columns is shown below:

Column C4 and C5 are mostly used by SELECT queries in the GROUP BY and ORDER BY clauses. Whereas columns C1, C2 and C3 are heavily used in filter and join conditions of SELECT queries.

The Architect must design a clustering key for this table to improve the query performance.

Based on Snowflake recommendations, how should the clustering key columns be ordered while defining the multi-column clustering key?

Options:

C5, C4, C2

C3, C4, C5

C1, C3, C2

C2, C1, C3

Questions 35

What is a characteristic of loading data into Snowflake using the Snowflake Connector for Kafka?

Options:

The Connector only works in Snowflake regions that use AWS infrastructure.

The Connector works with all file formats, including text, JSON, Avro, ORC, Parquet, and XML.

The Connector creates and manages its own stage, file format, and pipe objects.

Loads using the Connector will have lower latency than Snowpipe and will ingest data in real time.

Answer:

C

Explanation:

According to the SnowPro Advanced: Architect documents and learning resources, a characteristic of loading data into Snowflake using the Snowflake Connector for Kafka is that the Connector creates and manages its own stage, file format, and pipe objects. The stage is an internal stage that is used to store the data files from the Kafka topics. The file format is a JSON or Avro file format that is used to parse the data files. The pipe is a Snowpipe object that is used to load the data files into the Snowflake table. The Connector automatically creates and configures these objects based on the Kafka configuration properties, and handles the cleanup and maintenance of these objects1.

The other options are incorrect because they are not characteristics of loading data into Snowflake using the Snowflake Connector for Kafka:

Option A is incorrect because the Connector works in Snowflake regions that use any cloud infrastructure, not just AWS. The Connector supports AWS, Azure, and Google Cloud platforms, and can load data across different regions and cloud platforms using data replication2.

Option B is incorrect because the Connector does not work with all file formats, only JSON and Avro. The Connector expects the data in the Kafka topics to be in JSON or Avro format, and parses the data accordingly. Other file formats, such as text, ORC, Parquet, or XML, are not supported by the Connector3.

Option D is incorrect because loads using the Connector do not have lower latency than Snowpipe, and do not ingest data in real time. The Connector uses Snowpipe to load data into Snowflake, and inherits the same latency and performance characteristics of Snowpipe. The Connector does not provide real-time ingestion, but near real-time ingestion, depending on the frequency and size of the data files4.

1: Installing and Configuring the Kafka Connector | Snowflake Documentation

2: Sharing Data Across Regions and Cloud Platforms | Snowflake Documentation

3: Overview of the Kafka Connector | Snowflake Documentation

4: Using Snowflake Connector for Kafka With Snowpipe Streaming | Snowflake Documentation

Questions 36

Which technique will efficiently ingest and consume semi-structured data for Snowflake data lake workloads?

Options:

IDEF1X

Schema-on-write

Schema-on-read

Information schema

Questions 37

An Architect is troubleshooting a query with poor performance using the QUERY_HISTORY function. The Architect observes that the COMPILATION_TIME is greater than the EXECUTION_TIME.

What is the reason for this?

Options:

The query is processing a very large dataset.

The query has overly complex logic.

The query is queued for execution.

The query is reading from remote storage.

Questions 38

A user can change object parameters using which of the following roles?

Options:

ACCOUNTADMIN, SECURITYADMIN

SYSADMIN, SECURITYADMIN

ACCOUNTADMIN, USER with PRIVILEGE

SECURITYADMIN, USER with PRIVILEGE

Questions 39

How can the Snowpipe REST API be used to keep a log of data load history?

Options:

Call insertReport every 20 minutes, fetching the last 10,000 entries.

Call loadHistoryScan every minute for the maximum time range.

Call insertReport every 8 minutes for a 10-minute time range.

Call loadHistoryScan every 10 minutes for a 15-minute time range.

Questions 40

A healthcare company wants to share data with a medical institute. The institute is running a Standard edition of Snowflake; the healthcare company is running a Business Critical edition.

How can this data be shared?

Options:

The healthcare company will need to change the institute’s Snowflake edition in the accounts panel.

By default, sharing is supported from a Business Critical Snowflake edition to a Standard edition.

Contact Snowflake and they will execute the share request for the healthcare company.

Set the share_restriction parameter on the shared object to false.

Questions 41

You are a Snowflake architect in an organization. The business team has come to you to deploy a use case that requires loading data they can visualize through Tableau. Every day new data comes in, and the old data is no longer required.

What type of table will you use in this case to optimize cost?

Options:

TRANSIENT

TEMPORARY

PERMANENT

Questions 42

Which Snowflake architecture recommendation needs multiple Snowflake accounts for implementation?

Options:

Enable a disaster recovery strategy across multiple cloud providers.

Create external stages pointing to cloud providers and regions other than the region hosting the Snowflake account.

Enable zero-copy cloning among the development, test, and production environments.

Enable separation of the development, test, and production environments.

Questions 43

What is a characteristic of Role-Based Access Control (RBAC) as used in Snowflake?

Options:

Privileges can be granted at the database level and can be inherited by all underlying objects.

A user can use a "super-user" access along with securityadmin to bypass authorization checks and access all databases, schemas, and underlying objects.

A user can create managed access schemas to support future grants and ensure only schema owners can grant privileges to other roles.

A user can create managed access schemas to support current and future grants and ensure only object owners can grant privileges to other roles.

Answer:

Explanation:

Role-Based Access Control (RBAC) is the Snowflake Access Control Framework that allows privileges to be granted by object owners to roles, and roles, in turn, can be assigned to users to restrict or allow actions to be performed on objects. A characteristic of RBAC as used in Snowflake is:

Privileges can be granted at the database level and can be inherited by all underlying objects. This means that a role that has a certain privilege on a database, such as CREATE SCHEMA or USAGE, can also perform the same action on any schema, table, view, or other object within that database, unless explicitly revoked. This simplifies the access control management and reduces the number of grants required.

A user can create managed access schemas to support future grants and ensure only schema owners can grant privileges to other roles. This means that a user can create a schema with the MANAGED ACCESS option, which changes the default behavior of object ownership and privilege granting within the schema. In a managed access schema, object owners lose the ability to grant privileges on their objects to other roles, and only the schema owner or a role with the MANAGE GRANTS privilege can do so. This enhances the security and governance of the schema and its objects.
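A managed access schema with a future grant could look roughly like the sketch below. The database, schema, and role names (analytics_db, reporting, staging, analyst_role) are hypothetical and are not taken from the question.

```sql
-- Hypothetical names: analytics_db, reporting, staging, analyst_role.
-- Create a schema in which grant decisions are centralized with the schema owner.
CREATE SCHEMA IF NOT EXISTS analytics_db.reporting WITH MANAGED ACCESS;

-- Future grant: tables created later in the schema become readable by analyst_role.
-- In a managed access schema this is issued by the schema owner or by a role
-- holding the MANAGE GRANTS privilege, not by individual object owners.
GRANT SELECT ON FUTURE TABLES IN SCHEMA analytics_db.reporting TO ROLE analyst_role;

-- An existing regular schema can also be converted to managed access.
ALTER SCHEMA analytics_db.staging ENABLE MANAGED ACCESS;
```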

The other options are not characteristics of RBAC as used in Snowflake:

A user can use a “super-user” access along with securityadmin to bypass authorization checks and access all databases, schemas, and underlying objects. This is not true, as there is no such thing as a “super-user” access in Snowflake. The securityadmin role is a predefined role that can manage users and roles, but it does not have any privileges on any database objects by default. To access any object, the securityadmin role must be explicitly granted the appropriate privilege by the object owner or another role with the grant option.

A user can create managed access schemas to support current and future grants and ensure only object owners can grant privileges to other roles. This is not true, as this contradicts the definition of a managed access schema. In a managed access schema, object owners cannot grant privileges on their objects to other roles, and only the schema owner or a role with the MANAGE GRANTS privilege can do so.

Overview of Access Control

A Functional Approach For Snowflake’s Role-Based Access Controls

Snowflake Role-Based Access Control simplified

Snowflake RBAC security prefers role inheritance to role composition

Overview of Snowflake Role Based Access Control

Questions 44

A company is trying to ingest 10 TB of CSV data into a Snowflake table using Snowpipe as part of its migration from a legacy database platform. The records need to be ingested in the MOST performant and cost-effective way.

How can these requirements be met?

Options:

Use ON_ERROR = continue in the copy into command.

Use purge = TRUE in the copy into command.

Use PURGE = FALSE in the copy into command.

Use ON_ERROR = SKIP_FILE in the copy into command.

Questions 45

An Architect needs to meet a company requirement to ingest files from the company's AWS storage accounts into the company's Snowflake Google Cloud Platform (GCP) account. How can the ingestion of these files into the company's Snowflake account be initiated? (Select TWO).

Options:

Configure the client application to call the Snowpipe REST endpoint when new files have arrived in Amazon S3 storage.

Configure the client application to call the Snowpipe REST endpoint when new files have arrived in Amazon S3 Glacier storage.

Create an AWS Lambda function to call the Snowpipe REST endpoint when new files have arrived in Amazon S3 storage.

Configure AWS Simple Notification Service (SNS) to notify Snowpipe when new files have arrived in Amazon S3 storage.

Configure the client application to issue a COPY INTO <table> command to Snowflake when new files have arrived in Amazon S3 Glacier storage.

Answer:

A, C

Explanation:

This question centers around cross-cloud ingestion using Snowpipe, where the Snowflake account is on Google Cloud, but the source data resides in Amazon S3. Since automatic Snowpipe event-based integration (using AWS SNS) is only supported when Snowflake is deployed on AWS, this limits the options.

In a multi-cloud scenario (e.g., S3 to GCP-hosted Snowflake), only manual triggering of Snowpipe is available, typically through the Snowpipe REST API.
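On the Snowflake side, a pipe intended for REST-triggered loading might be defined along these lines. This is only a sketch: the pipe, table, and stage names (sales_pipe, sales_raw, s3_landing_stage) and the CSV file format options are assumptions, and the external stage pointing at the S3 bucket is presumed to exist already.

```sql
-- Hypothetical names: sales_pipe, sales_raw, s3_landing_stage.
-- AUTO_INGEST = FALSE means the pipe waits for explicit requests
-- (for example, calls to the Snowpipe REST API) instead of cloud event notifications.
CREATE PIPE IF NOT EXISTS sales_pipe
  AUTO_INGEST = FALSE
  AS
  COPY INTO sales_raw
  FROM @s3_landing_stage
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);

-- Inspect the pipe's execution state and pending file count after files are submitted.
SELECT SYSTEM$PIPE_STATUS('sales_pipe');
```

The client application or Lambda function would then submit the names of newly arrived files to this pipe through the REST endpoint.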

Option A – Correct

You can configure your client application to call the Snowpipe REST API when files are available. This is suitable and supported for cross-cloud ingestion, such as from AWS S3 into a Snowflake account hosted on GCP.

Official Extract:

"For cloud platforms other than AWS, or when using Snowpipe in a cross-cloud or external storage configuration, the REST API must be used to trigger ingestion."

Source: Snowflake Docs – Using Snowpipe REST API

Option C – Correct

AWS Lambda can be used to invoke the Snowpipe REST API when files arrive in S3. This is a common pattern in cross-cloud integrations, acting as a middle layer.

Official Extract:

"You can use an AWS Lambda function to call the Snowpipe REST API when new files are added to your S3 bucket. This pattern works for Snowflake accounts hosted outside of AWS."

Source: Snowflake Docs – Automating Snowpipe REST API Calls

Option B – Incorrect

S3 Glacier is an archival storage service and is not supported for direct loading into Snowflake, since data is not immediately accessible.

Option D – Incorrect

Using AWS SNS for auto-notification is only supported when Snowflake is deployed on AWS. Since this question specifies Snowflake on GCP, SNS-based integration is not supported.

Official Extract:

"Snowpipe supports Amazon S3 event notifications (SNS/SQS) only for Snowflake accounts hosted on AWS."

Source: Snowflake Docs – Supported Cloud Platforms

Option E – Incorrect

Similar to Option B, S3 Glacier is not a valid source for data loading because it does not provide real-time file access.

References: Snowflake Documentation – Snowpipe Overview, Snowpipe REST API, Automating Snowpipe with Lambda, Automatic Ingestion Support – Cloud Limitations

Questions 46

A global company needs to securely share its sales and inventory data with a vendor using a Snowflake account.

The company has its Snowflake account in the AWS eu-west-2 Europe (London) region. The vendor's Snowflake account is on the Azure platform in the West Europe region. How should the company's Architect configure the data share?

Options:

1. Create a share. 2. Add objects to the share. 3. Add a consumer account to the share for the vendor to access.

1. Create a share. 2. Create a reader account for the vendor to use. 3. Add the reader account to the share.

1. Create a new role called db_share. 2. Grant the db_share role privileges to read data from the company database and schema. 3. Create a user for the vendor. 4. Grant the ds_share role to the vendor's users.

1. Promote an existing database in the company's local account to primary. 2. Replicate the database to Snowflake on Azure in the West-Europe region. 3. Create a share and add objects to the share. 4. Add a consumer account to the share for the vendor to access.

Questions 47

What integration object should be used to place restrictions on where data may be exported?

Options:

Stage integration

Security integration

Storage integration

API integration

Questions 48

How can the Snowpipe REST API be used to keep a log of data load history?

Options:

Call insertReport every 20 minutes, fetching the last 10,000 entries.

Call loadHistoryScan every minute for the maximum time range.

Call insertReport every 8 minutes for a 10-minute time range.

Call loadHistoryScan every 10 minutes for a 15-minute time range.

Answer:

D

Explanation:

The Snowpipe REST API provides two endpoints for retrieving the data load history: insertReport and loadHistoryScan. The insertReport endpoint returns the status of the files that were submitted to the insertFiles endpoint, while the loadHistoryScan endpoint returns the history of the files that were actually loaded into the table by Snowpipe. To keep a log of data load history, it is recommended to use the loadHistoryScan endpoint, which provides more accurate and complete information about the data ingestion process. The loadHistoryScan endpoint accepts a start time and an end time as parameters, and returns the files that were loaded within that time range. The maximum time range that can be specified is 15 minutes, and the maximum number of files that can be returned is 10,000. Therefore, to keep a log of data load history, the best option is to call the loadHistoryScan endpoint every 10 minutes for a 15-minute time range, and store the results in a log file or a table. Because each 15-minute window overlaps the previous one by 5 minutes, the log captures every file loaded by Snowpipe without gaps; entries repeated in the overlap can be deduplicated when they are stored. The other options are incorrect because:

Calling insertReport every 20 minutes, fetching the last 10,000 entries, will not provide a complete log of data load history, as some files may be missed or duplicated due to the asynchronous nature of Snowpipe. Moreover, insertReport only returns the status of the files that were submitted, not the files that were loaded.

Calling loadHistoryScan every minute for the maximum time range will result in too many API calls and unnecessary overhead, as the same files will be returned multiple times. Moreover, the maximum time range is 15 minutes, not 1 minute.

Calling insertReport every 8 minutes for a 10-minute time range will suffer from the same problems as option A, and also create gaps or overlaps in the time range.

Snowpipe REST API

Option 1: Loading Data Using the Snowpipe REST API

PIPE_USAGE_HISTORY

Exam Code: ARA-C01

Exam Name: SnowPro Advanced: Architect Certification Exam

Last Update: Nov 30, 2025

Questions: 162
