

Professional-Machine-Learning-Engineer Sample Questions Answers

Questions 4

You are developing a model to detect fraudulent credit card transactions. You need to prioritize detection because missing even one fraudulent transaction could severely impact the credit card holder. You used AutoML to train a model on users' profile information and credit card transaction data. After training the initial model, you notice that the model is failing to detect many fraudulent transactions. How should you adjust the training parameters in AutoML to improve model performance?

Choose 2 answers

Options:

A.

Increase the score threshold.

B.

Decrease the score threshold.

C.

Add more positive examples to the training set.

D.

Add more negative examples to the training set.

E.

Reduce the maximum number of node hours for training.
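To illustrate the threshold trade-off behind options A and B, here is a small, hypothetical sketch (the data is made up, not part of the question): lowering the score threshold flags more transactions as fraud, which raises recall at the cost of precision.

```python
# Made-up scores and labels to show the effect of lowering the score threshold.
import numpy as np
from sklearn.metrics import precision_score, recall_score

y_true = np.array([0, 0, 0, 1, 1, 0, 1, 0, 1, 0])          # 1 = fraudulent
y_score = np.array([0.1, 0.3, 0.45, 0.55, 0.35, 0.2, 0.8, 0.05, 0.6, 0.4])

for threshold in (0.5, 0.3):
    y_pred = (y_score >= threshold).astype(int)   # lower threshold -> more flagged
    print(
        f"threshold={threshold}: "
        f"recall={recall_score(y_true, y_pred):.2f}, "
        f"precision={precision_score(y_true, y_pred):.2f}"
    )
```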

Questions 5

You have trained a text classification model in TensorFlow using AI Platform. You want to use the trained model for batch predictions on text data stored in BigQuery while minimizing computational overhead. What should you do?

Options:

A.

Export the model to BigQuery ML.

B.

Deploy and version the model on AI Platform.

C.

Use Dataflow with the SavedModel to read the data from BigQuery.

D.

Submit a batch prediction job on AI Platform that points to the model location in Cloud Storage.

Questions 6

You are working on a Neural Network-based project. The dataset provided to you has columns with different ranges. While preparing the data for model training, you discover that gradient optimization is having difficulty moving weights to a good solution. What should you do?

Options:

A.

Use feature construction to combine the strongest features.

B.

Use the representation transformation (normalization) technique.

C.

Improve the data cleaning step by removing features with missing values.

D.

Change the partitioning step to reduce the dimension of the test set and have a larger training set.
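A minimal sketch, assuming a Keras workflow, of the representation transformation named in option B: features with very different ranges are standardized so gradient descent can move the weights more easily. The training data below is a made-up placeholder.

```python
import numpy as np
import tensorflow as tf

# Hypothetical training data: one column in [0, 1], another in the thousands.
x_train = np.array([[0.2, 1500.0], [0.8, 3200.0], [0.5, 800.0]], dtype="float32")

normalizer = tf.keras.layers.Normalization()   # learns per-feature mean/variance
normalizer.adapt(x_train)                      # compute statistics from the data

model = tf.keras.Sequential([
    normalizer,                                # inputs are standardized here
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
```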

Questions 7

You have been asked to build a model using a dataset that is stored in a medium-sized (~10 GB) BigQuery table. You need to quickly determine whether this data is suitable for model development. You want to create a one-time report that includes both informative visualizations of data distributions and more sophisticated statistical analyses to share with other ML engineers on your team. You require maximum flexibility to create your report. What should you do?

Options:

A.

Use Vertex AI Workbench user-managed notebooks to generate the report.

B.

Use Google Data Studio to create the report.

C.

Use the output from TensorFlow Data Validation on Dataflow to generate the report.

D.

Use Dataprep to create the report.

Questions 8

You work for a pharmaceutical company based in Canada. Your team developed a BigQuery ML model to predict the number of flu infections for the next month in Canada. Weather data is published weekly, and flu infection statistics are published monthly. You need to configure a model retraining policy that minimizes cost. What should you do?

Options:

A.

Download the weather and flu data each week. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model weekly.

B.

Download the weather and flu data each month. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model monthly.

C.

Download the weather and flu data each week. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model every month.

D.

Download the weather data each week, and download the flu data each month. Deploy the model to a Vertex AI endpoint with feature drift monitoring, and retrain the model if a monitoring alert is detected.

Questions 9

Your organization manages an online message board. A few months ago, you discovered an increase in toxic language and bullying on the message board. You deployed an automated text classifier that flags certain comments as toxic or harmful. Now some users are reporting that benign comments referencing their religion are being misclassified as abusive. Upon further inspection, you find that your classifier's false positive rate is higher for comments that reference certain underrepresented religious groups. Your team has a limited budget and is already overextended. What should you do?

Options:

A.

Add synthetic training data where those phrases are used in non-toxic ways.

B.

Remove the model and replace it with human moderation.

C.

Replace your model with a different text classifier.

D.

Raise the threshold for comments to be considered toxic or harmful.

Questions 10

You work for a large retailer and you need to build a model to predict customer churn. The company has a dataset of historical customer data, including customer demographics, purchase history, and website activity. You need to create the model in BigQuery ML and thoroughly evaluate its performance. What should you do?

Options:

A.

Create a linear regression model in BigQuery ML and register the model in Vertex AI Model Registry. Evaluate the model performance in Vertex AI.

B.

Create a logistic regression model in BigQuery ML and register the model in Vertex AI Model Registry. Evaluate the model performance in Vertex AI.

C.

Create a linear regression model in BigQuery ML. Use the ml.evaluate function to evaluate the model performance.

D.

Create a logistic regression model in BigQuery ML. Use the ml.confusion_matrix function to evaluate the model performance.

Questions 11

You are using Keras and TensorFlow to develop a fraud detection model. Records of customer transactions are stored in a large table in BigQuery. You need to preprocess these records in a cost-effective and efficient way before you use them to train the model. The trained model will be used to perform batch inference in BigQuery. How should you implement the preprocessing workflow?

Options:

A.

Implement a preprocessing pipeline by using Apache Spark, and run the pipeline on Dataproc. Save the preprocessed data as CSV files in a Cloud Storage bucket.

B.

Load the data into a pandas DataFrame. Implement the preprocessing steps using pandas transformations, and train the model directly on the DataFrame.

C.

Perform preprocessing in BigQuery by using SQL. Use the BigQueryClient in TensorFlow to read the data directly from BigQuery.

D.

Implement a preprocessing pipeline by using Apache Beam, and run the pipeline on Dataflow. Save the preprocessed data as CSV files in a Cloud Storage bucket.
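A hedged sketch of the approach in option C: push the preprocessing into BigQuery with SQL and materialize a table for training, so the raw records never leave BigQuery. The project, dataset, and column names below are placeholders, not from the question.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project ID

query = """
CREATE OR REPLACE TABLE `my-project.fraud.transactions_prepped` AS
SELECT
  customer_id,
  SAFE_CAST(amount AS FLOAT64) AS amount,
  (amount - AVG(amount) OVER ()) / STDDEV(amount) OVER () AS amount_z,
  EXTRACT(HOUR FROM transaction_time) AS txn_hour,
  CAST(is_fraud AS INT64) AS label
FROM `my-project.fraud.transactions_raw`
WHERE amount IS NOT NULL
"""
client.query(query).result()  # blocks until the preprocessed table is written
```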

Questions 12

You are an ML engineer in the contact center of a large enterprise. You need to build a sentiment analysis tool that predicts customer sentiment from recorded phone conversations. You need to identify the best approach to building a model while ensuring that the gender, age, and cultural differences of the customers who called the contact center do not impact any stage of the model development pipeline and results. What should you do?

Options:

A.

Extract sentiment directly from the voice recordings.

B.

Convert the speech to text and build a model based on the words.

C.

Convert the speech to text and extract sentiments based on the sentences.

D.

Convert the speech to text and extract sentiment using syntactical analysis.

Questions 13

You developed a Python module by using Keras to train a regression model. You developed two model architectures, linear regression and deep neural network (DNN), within the same module. You are using the training_method argument to select one of the two methods, and you are using the learning_rate and num_hidden_layers arguments in the DNN. You plan to use Vertex AI's hyperparameter tuning service with a budget of 100 trials. You want to identify the model architecture and hyperparameter values that minimize training loss and maximize model performance. What should you do?

Options:

A.

Run one hypertuning job for 100 trials. Set num_hidden_layers as a conditional hyperparameter based on its parent hyperparameter training_method, and set learning_rate as a non-conditional hyperparameter.

B.

Run two separate hypertuning jobs: a linear regression job for 50 trials, and a DNN job for 50 trials. Compare their final performance on a common validation set, and select the set of hyperparameters with the least training loss.

C.

Run one hypertuning job for 100 trials. Set num_hidden_layers and learning_rate as conditional hyperparameters based on their parent hyperparameter training_method.

D.

Run one hypertuning job with training_method as the hyperparameter for 50 trials. Select the architecture with the lowest training loss, and further hypertune it and its corresponding hyperparameters for 50 trials.

Questions 14

You recently used XGBoost to train a model in Python that will be used for online serving. Your model prediction service will be called by a backend service implemented in Golang running on a Google Kubernetes Engine (GKE) cluster. Your model requires pre- and postprocessing steps. You need to implement the processing steps so that they run at serving time. You want to minimize code changes and infrastructure maintenance and deploy your model into production as quickly as possible. What should you do?

Options:

A.

Use FastAPI to implement an HTTP server. Create a Docker image that runs your HTTP server, and deploy it on your organization's GKE cluster.

B.

Use FastAPI to implement an HTTP server. Create a Docker image that runs your HTTP server. Upload the image to Vertex AI Model Registry and deploy it to a Vertex AI endpoint.

C.

Use the Predictor interface to implement a custom prediction routine. Build the custom container, upload the container to Vertex AI Model Registry, and deploy it to a Vertex AI endpoint.

D.

Use the XGBoost prebuilt serving container when importing the trained model into Vertex AI. Deploy the model to a Vertex AI endpoint. Work with the backend engineers to implement the pre- and postprocessing steps in the Golang backend service.

Questions 15

You have created a Vertex AI pipeline that automates custom model training. You want to add a pipeline component that enables your team to most easily collaborate when running different executions and comparing metrics both visually and programmatically. What should you do?

Options:

A.

Add a component to the Vertex AI pipeline that logs metrics to a BigQuery table. Query the table to compare different executions of the pipeline. Connect BigQuery to Looker Studio to visualize metrics.

B.

Add a component to the Vertex AI pipeline that logs metrics to a BigQuery table. Load the table into a pandas DataFrame to compare different executions of the pipeline. Use Matplotlib to visualize metrics.

C.

Add a component to the Vertex AI pipeline that logs metrics to Vertex ML Metadata. Use Vertex AI Experiments to compare different executions of the pipeline. Use Vertex AI TensorBoard to visualize metrics.

D.

Add a component to the Vertex AI pipeline that logs metrics to Vertex ML Metadata. Load the Vertex ML Metadata into a pandas DataFrame to compare different executions of the pipeline. Use Matplotlib to visualize metrics.

Questions 16

You have been given a dataset with sales predictions based on your company’s marketing activities. The data is structured and stored in BigQuery, and has been carefully managed by a team of data analysts. You need to prepare a report providing insights into the predictive capabilities of the data. You were asked to run several ML models with different levels of sophistication, including simple models and multilayered neural networks. You only have a few hours to gather the results of your experiments. Which Google Cloud tools should you use to complete this task in the most efficient and self-serviced way?

Options:

A.

Use BigQuery ML to run several regression models, and analyze their performance.

B.

Read the data from BigQuery using Dataproc, and run several models using SparkML.

C.

Use Vertex AI Workbench user-managed notebooks with scikit-learn code for a variety of ML algorithms and performance metrics.

D.

Train a custom TensorFlow model with Vertex AI, reading the data from BigQuery featuring a variety of ML algorithms.

Questions 17

You are developing an ML model intended to classify whether X-ray images indicate bone fracture risk. You have trained a ResNet architecture on Vertex AI using a TPU as an accelerator; however, you are unsatisfied with the training time and memory usage. You want to quickly iterate your training code but make minimal changes to the code. You also want to minimize impact on the model's accuracy. What should you do?

Options:

A.

Configure your model to use bfloat16 instead of float32.

B.

Reduce the global batch size from 1024 to 256

C.

Reduce the number of layers in the model architecture

D.

Reduce the dimensions of the images used in the model.
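A minimal sketch of option A, assuming a Keras training script: switching compute to bfloat16 through a mixed-precision policy is essentially a one-line change that typically reduces TPU memory use and step time with little impact on accuracy.

```python
import tensorflow as tf

# One-line change: run layer computations in bfloat16 while keeping
# variables in float32 (Keras mixed-precision policy).
tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")

model = tf.keras.applications.ResNet50(weights=None, classes=2)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(...) proceeds unchanged.
```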

Questions 18

You work for a global footwear retailer and need to predict when an item will be out of stock based on historical inventory data. Customer behavior is highly dynamic since footwear demand is influenced by many different factors. You want to serve models that are trained on all available data, but track your performance on specific subsets of data before pushing to production. What is the most streamlined and reliable way to perform this validation?

Options:

A.

Use the TFX ModelValidator tools to specify performance metrics for production readiness.

B.

Use k-fold cross-validation as a validation strategy to ensure that your model is ready for production.

C.

Use the last relevant week of data as a validation set to ensure that your model is performing accurately on current data.

D.

Use the entire dataset and treat the area under the receiver operating characteristics curve (AUC ROC) as the main metric.

Questions 19

You work for a retailer that sells clothes to customers around the world. You have been tasked with ensuring that ML models are built in a secure manner. Specifically, you need to protect sensitive customer data that might be used in the models. You have identified four fields containing sensitive data that are being used by your data science team: AGE, IS_EXISTING_CUSTOMER, LATITUDE_LONGITUDE, and SHIRT_SIZE. What should you do with the data before it is made available to the data science team for training purposes?

Options:

A.

Tokenize all of the fields using hashed dummy values to replace the real values.

B.

Use principal component analysis (PCA) to reduce the four sensitive fields to one PCA vector.

C.

Coarsen the data by putting AGE into quantiles and rounding LATITUDE_LONGITUDE into single precision. The other two fields are already as coarse as possible.

D.

Remove all sensitive data fields, and ask the data science team to build their models using non-sensitive data.

Questions 20

You are developing an image recognition model using PyTorch based on the ResNet50 architecture. Your code is working fine on your local laptop on a small subsample. Your full dataset has 200k labeled images. You want to quickly scale your training workload while minimizing cost. You plan to use 4 V100 GPUs. What should you do?

Options:

A.

Configure a Compute Engine VM with all the dependencies that launches the training. Train your model with Vertex AI using a custom tier that contains the required GPUs.

B.

Package your code with Setuptools, and use a pre-built container. Train your model with Vertex AI using a custom tier that contains the required GPUs.

C.

Create a Vertex AI Workbench user-managed notebooks instance with 4 V100 GPUs, and use it to train your model.

D.

Create a Google Kubernetes Engine cluster with a node pool that has 4 V100 GPUs. Prepare and submit a TFJob operator to this node pool.

Questions 21

You work at a mobile gaming startup that creates online multiplayer games. Recently, your company observed an increase in players cheating in the games, leading to a loss of revenue and a poor user experience. You built a binary classification model to determine whether a player cheated after a completed game session, and then send a message to other downstream systems to ban the player that cheated. Your model has performed well during testing, and you now need to deploy the model to production. You want your serving solution to provide immediate classifications after a completed game session to avoid further loss of revenue. What should you do?

Options:

A.

Import the model into Vertex AI Model Registry. Use the Vertex Batch Prediction service to run batch inference jobs.

B.

Save the model files in a Cloud Storage bucket. Create a Cloud Function to read the model files and make online inference requests on the Cloud Function.

C.

Save the model files in a VM. Load the model files each time there is a prediction request, and run an inference job on the VM.

D.

Import the model into Vertex AI Model Registry. Create a Vertex AI endpoint that hosts the model, and make online inference requests.

Questions 22

You are building a predictive maintenance model to preemptively detect part defects in bridges. You plan to use high definition images of the bridges as model inputs. You need to explain the output of the model to the relevant stakeholders so they can take appropriate action. How should you build the model?

Options:

A.

Use scikit-learn to build a tree-based model, and use SHAP values to explain the model output.

B.

Use scikit-learn to build a tree-based model, and use partial dependence plots (PDP) to explain the model output.

C.

Use TensorFlow to create a deep learning-based model and use Integrated Gradients to explain the model output.

D.

Use TensorFlow to create a deep learning-based model and use the sampled Shapley method to explain the model output.

Questions 23

You are building a TensorFlow model for a financial institution that predicts the impact of consumer spending on inflation globally. Due to the size and nature of the data, your model is long-running across all types of hardware, and you have built frequent checkpointing into the training process. Your organization has asked you to minimize cost. What hardware should you choose?

Options:

A.

A Vertex AI Workbench user-managed notebooks instance running on an n1-standard-16 with 4 NVIDIA P100 GPUs

B.

A Vertex AI Workbench user-managed notebooks instance running on an n1-standard-16 with an NVIDIA P100 GPU

C.

A Vertex AI Workbench user-managed notebooks instance running on an n1-standard-16 with a non-preemptible v3-8 TPU

D.

A Vertex AI Workbench user-managed notebooks instance running on an n1-standard-16 with a preemptible v3-8 TPU

Questions 24

You are building an MLOps platform to automate your company's ML experiments and model retraining. You need to organize the artifacts for dozens of pipelines. How should you store the pipelines' artifacts?

Options:

A.

Store parameters in Cloud SQL, and store the models' source code and binaries in GitHub.

B.

Store parameters in Cloud SQL, store the models' source code in GitHub, and store the models' binaries in Cloud Storage.

C.

Store parameters in Vertex ML Metadata, store the models' source code in GitHub, and store the models' binaries in Cloud Storage.

D.

Store parameters in Vertex ML Metadata, and store the models' source code and binaries in GitHub.

Questions 25

You work on a growing team of more than 50 data scientists who all use AI Platform. You are designing a strategy to organize your jobs, models, and versions in a clean and scalable way. Which strategy should you choose?

Options:

A.

Set up restrictive IAM permissions on the AI Platform notebooks so that only a single user or group can access a given instance.

B.

Separate each data scientist's work into a different project to ensure that the jobs, models, and versions created by each data scientist are accessible only to that user.

C.

Use labels to organize resources into descriptive categories. Apply a label to each created resource so that users can filter the results by label when viewing or monitoring the resources.

D.

Set up a BigQuery sink for Cloud Logging logs that is appropriately filtered to capture information about AI Platform resource usage. In BigQuery, create a SQL view that maps users to the resources they are using.

Questions 26

You want to train an AutoML model to predict house prices by using a small public dataset stored in BigQuery. You need to prepare the data and want to use the simplest, most efficient approach. What should you do?

Options:

A.

Write a query that preprocesses the data by using BigQuery and creates a new table. Create a Vertex AI managed dataset with the new table as the data source.

B.

Use Dataflow to preprocess the data. Write the output in TFRecord format to a Cloud Storage bucket.

C.

Write a query that preprocesses the data by using BigQuery. Export the query results as CSV files, and use those files to create a Vertex AI managed dataset.

D.

Use a Vertex AI Workbench notebook instance to preprocess the data by using the pandas library. Export the data as CSV files, and use those files to create a Vertex AI managed dataset.

Questions 27

You are creating a social media app where pet owners can post images of their pets. You have one million user-uploaded images with hashtags. You want to build a comprehensive system that recommends images to users that are similar in appearance to their own uploaded images.

What should you do?

Options:

A.

Download a pretrained convolutional neural network, and fine-tune the model to predict hashtags based on the input images. Use the predicted hashtags to make recommendations.

B.

Retrieve image labels and dominant colors from the input images using the Vision API. Use these properties and the hashtags to make recommendations.

C.

Use the provided hashtags to create a collaborative filtering algorithm to make recommendations.

D.

Download a pretrained convolutional neural network, and use the model to generate embeddings of the input images. Measure similarity between embeddings to make recommendations.
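A hedged sketch of the approach in option D: use a pretrained CNN as a feature extractor, embed every image, and recommend the catalog images whose embeddings are closest to the user's upload. The image loading and catalog are placeholders.

```python
import numpy as np
import tensorflow as tf

# Pretrained CNN used only as a feature extractor (2048-d embedding per image).
base = tf.keras.applications.ResNet50(weights="imagenet", include_top=False,
                                       pooling="avg")

def embed(images):
    """images: float32 array of shape (n, 224, 224, 3)."""
    x = tf.keras.applications.resnet50.preprocess_input(images)
    return base.predict(x, verbose=0)

def most_similar(query_vec, catalog_vecs, top_k=5):
    q = query_vec / np.linalg.norm(query_vec)
    c = catalog_vecs / np.linalg.norm(catalog_vecs, axis=1, keepdims=True)
    scores = c @ q                      # cosine similarity against the catalog
    return np.argsort(-scores)[:top_k]  # indices of the closest images
```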

Questions 28

You need to design an architecture that serves asynchronous predictions to determine whether a particular mission-critical machine part will fail. Your system collects data from multiple sensors from the machine. You want to build a model that will predict a failure in the next N minutes, given the average of each sensor’s data from the past 12 hours. How should you design the architecture?

Options:

A.

1. HTTP requests are sent by the sensors to your ML model, which is deployed as a microservice and exposes a REST API for prediction

2. Your application queries a Vertex AI endpoint where you deployed your model.

3. Responses are received by the caller application as soon as the model produces the prediction.

B.

1. Events are sent by the sensors to Pub/Sub, consumed in real time, and processed by a Dataflow stream processing pipeline.

2. The pipeline invokes the model for prediction and sends the predictions to another Pub/Sub topic.

3. Pub/Sub messages containing predictions are then consumed by a downstream system for monitoring.

C.

1. Export your data to Cloud Storage using Dataflow.

2. Submit a Vertex AI batch prediction job that uses your trained model in Cloud Storage to perform scoring on the preprocessed data.

3. Export the batch prediction job outputs from Cloud Storage and import them into Cloud SQL.

D.

1. Export the data to Cloud Storage using the BigQuery command-line tool

2. Submit a Vertex AI batch prediction job that uses your trained model in Cloud Storage to perform scoring on the preprocessed data.

3. Export the batch prediction job outputs from Cloud Storage and import them into BigQuery.

Questions 29

You are the Director of Data Science at a large company, and your Data Science team has recently begun using the Kubeflow Pipelines SDK to orchestrate their training pipelines. Your team is struggling to integrate their custom Python code into the Kubeflow Pipelines SDK. How should you instruct them to proceed in order to quickly integrate their code with the Kubeflow Pipelines SDK?

Options:

A.

Use the func_to_container_op function to create custom components from the Python code.

B.

Use the predefined components available in the Kubeflow Pipelines SDK to access Dataproc, and run the custom code there.

C.

Package the custom Python code into Docker containers, and use the load_component_from_file function to import the containers into the pipeline.

D.

Deploy the custom Python code to Cloud Functions, and use Kubeflow Pipelines to trigger the Cloud Function.
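A minimal sketch of option A, assuming the Kubeflow Pipelines v1 SDK (where func_to_container_op is available): an existing Python function is wrapped as a pipeline component without writing a Dockerfile. The function body is a placeholder for the team's custom code.

```python
import kfp
from kfp.components import func_to_container_op

def normalize_column(value: float, mean: float, std: float) -> float:
    """Placeholder for existing team code: standardize a single value."""
    return (value - mean) / std

# Turn the plain Python function into a reusable pipeline component.
normalize_op = func_to_container_op(normalize_column, base_image="python:3.10")

@kfp.dsl.pipeline(name="custom-code-pipeline")
def pipeline(value: float = 42.0):
    normalize_op(value=value, mean=40.0, std=5.0)
```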

Questions 30

You need to build classification workflows over several structured datasets currently stored in BigQuery. Because you will be performing the classification several times, you want to complete the following steps without writing code: exploratory data analysis, feature selection, model building, training, and hyperparameter tuning and serving. What should you do?

Options:

A.

Configure AutoML Tables to perform the classification task

B.

Run a BigQuery ML task to perform logistic regression for the classification

C.

Use AI Platform Notebooks to run the classification model with the pandas library.

D.

Use AI Platform to run the classification model job configured for hyperparameter tuning.

Questions 31

You recently deployed a pipeline in Vertex AI Pipelines that trains and pushes a model to a Vertex AI endpoint to serve real-time traffic. You need to continue experimenting and iterating on your pipeline to improve model performance. You plan to use Cloud Build for CI/CD. You want to quickly and easily deploy new pipelines into production, and you want to minimize the chance that the new pipeline implementations will break in production. What should you do?

Options:

A.

Set up a CI/CD pipeline that builds and tests your source code. If the tests are successful, use the Google Cloud console to upload the built container to Artifact Registry and upload the compiled pipeline to Vertex AI Pipelines.

B.

Set up a CI/CD pipeline that builds your source code and then deploys built artifacts into a pre-production environment. Run unit tests in the pre-production environment. If the tests are successful, deploy the pipeline to production.

C.

Set up a CI/CD pipeline that builds and tests your source code and then deploys built artifacts into a pre-production environment. After a successful pipeline run in the pre-production environment, deploy the pipeline to production.

D.

Set up a CI/CD pipeline that builds and tests your source code and then deploys built artifacts into a pre-production environment. After a successful pipeline run in the pre-production environment, rebuild the source code, and deploy the artifacts to production.

Questions 32

You are developing an image recognition model using PyTorch based on the ResNet50 architecture. Your code is working fine on your local laptop on a small subsample. Your full dataset has 200k labeled images. You want to quickly scale your training workload while minimizing cost. You plan to use 4 V100 GPUs. What should you do?

Options:

A.

Create a Google Kubernetes Engine cluster with a node pool that has 4 V100 GPUs. Prepare and submit a TFJob operator to this node pool.

B.

Configure a Compute Engine VM with all the dependencies that launches the training. Train your model with Vertex AI using a custom tier that contains the required GPUs.

C.

Create a Vertex AI Workbench user-managed notebooks instance with 4 V100 GPUs, and use it to train your model.

D.

Package your code with Setuptools, and use a pre-built container. Train your model with Vertex AI using a custom tier that contains the required GPUs.

Questions 33

You work at a subscription-based company. You have trained an ensemble of trees and neural networks to predict customer churn, which is the likelihood that customers will not renew their yearly subscription. The average prediction is a 15% churn rate, but for a particular customer the model predicts that they are 70% likely to churn. The customer has a product usage history of 30%, is located in New York City, and became a customer in 1997. You need to explain the difference between the actual prediction, a 70% churn rate, and the average prediction. You want to use Vertex Explainable AI. What should you do?

Options:

A.

Train local surrogate models to explain individual predictions.

B.

Configure sampled Shapley explanations on Vertex Explainable AI.

C.

Configure integrated gradients explanations on Vertex Explainable AI.

D.

Measure the effect of each feature as the weight of the feature multiplied by the feature value.

Questions 34

You work for a gaming company that develops massively multiplayer online (MMO) games. You built a TensorFlow model that predicts whether players will make in-app purchases of more than $10 in the next two weeks. The model’s predictions will be used to adapt each user’s game experience. User data is stored in BigQuery. How should you serve your model while optimizing cost, user experience, and ease of management?

Options:

A.

Import the model into BigQuery ML. Make predictions using batch reading data from BigQuery, and push the data to Cloud SQL

B.

Deploy the model to Vertex AI Prediction. Make predictions using batch reading data from Cloud Bigtable, and push the data to Cloud SQL.

C.

Embed the model in the mobile application. Make predictions after every in-app purchase event is published in Pub/Sub, and push the data to Cloud SQL.

D.

Embed the model in the streaming Dataflow pipeline. Make predictions after every in-app purchase event is published in Pub/Sub, and push the data to Cloud SQL.

Questions 35

You recently created a new Google Cloud project. After testing that you can submit a Vertex AI Pipelines job from Cloud Shell, you want to use a Vertex AI Workbench user-managed notebook instance to run your code from that instance. You created the instance and ran the code, but this time the job fails with an insufficient permissions error. What should you do?

Options:

A.

Ensure that the Workbench instance that you created is in the same region as the Vertex AI Pipelines resources you will use.

B.

Ensure that the Vertex AI Workbench instance is on the same subnetwork as the Vertex AI Pipelines resources that you will use.

C.

Ensure that the Vertex AI Workbench instance is assigned the Identity and Access Management (IAM) Vertex AI User role.

D.

Ensure that the Vertex AI Workbench instance is assigned the Identity and Access Management (IAM) Notebooks Runner role.

Questions 36

You are developing a process for training and running your custom model in production. You need to be able to show lineage for your model and predictions. What should you do?

Options:

A.

1. Create a Vertex AI managed dataset.

2. Use a Vertex AI training pipeline to train your model.

3. Generate batch predictions in Vertex AI.

B.

1. Use a Vertex AI Pipelines custom training job component to train your model.

2. Generate predictions by using a Vertex AI Pipelines model batch predict component.

C.

1. Upload your dataset to BigQuery.

2. Use a Vertex AI custom training job to train your model.

3. Generate predictions by using Vertex AI SDK custom prediction routines.

D.

1. Use Vertex AI Experiments to train your model.

2. Register your model in Vertex AI Model Registry.

3. Generate batch predictions in Vertex AI.

Questions 37

You are working on a binary classification ML algorithm that detects whether an image of a classified scanned document contains a company’s logo. In the dataset, 96% of examples don’t have the logo, so the dataset is very skewed. Which metrics would give you the most confidence in your model?

Options:

A.

F-score where recall is weighed more than precision

B.

RMSE

C.

F1 score

D.

F-score where precision is weighed more than recall
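A short illustration of the metric in option A: an F-beta score with beta greater than 1 weighs recall more heavily than precision, which matters on a skewed dataset where missing a logo (a false negative) is the costly error. The labels below are made up.

```python
from sklearn.metrics import f1_score, fbeta_score

y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]   # 1 = contains the logo (rare class)
y_pred = [1, 1, 0, 1, 1, 0, 0, 0, 0, 0]

print("F1  :", round(f1_score(y_true, y_pred), 3))                 # equal weighting
print("F2  :", round(fbeta_score(y_true, y_pred, beta=2), 3))      # recall weighed more
print("F0.5:", round(fbeta_score(y_true, y_pred, beta=0.5), 3))    # precision weighed more
```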

Questions 38

You work on the data science team at a manufacturing company. You are reviewing the company's historical sales data, which has hundreds of millions of records. For your exploratory data analysis, you need to calculate descriptive statistics such as mean, median, and mode; conduct complex statistical tests for hypothesis testing; and plot variations of the features over time. You want to use as much of the sales data as possible in your analyses while minimizing computational resources. What should you do?

Options:

A.

Spin up a Vertex AI Workbench user-managed notebooks instance and import the dataset. Use this data to create statistical and visual analyses.

B.

Visualize the time plots in Google Data Studio. Import the dataset into Vertex AI Workbench user-managed notebooks. Use this data to calculate the descriptive statistics and run the statistical analyses.

C.

Use BigQuery to calculate the descriptive statistics. Use Vertex AI Workbench user-managed notebooks to visualize the time plots and run the statistical analyses.

D.

Use BigQuery to calculate the descriptive statistics, and use Google Data Studio to visualize the time plots. Use Vertex AI Workbench user-managed notebooks to run the statistical analyses.

Questions 39

You are implementing a batch inference ML pipeline in Google Cloud. The model was developed using TensorFlow and is stored in SavedModel format in Cloud Storage. You need to apply the model to a historical dataset containing 10 TB of data that is stored in a BigQuery table. How should you perform the inference?

Options:

A.

Export the historical data to Cloud Storage in Avro format. Configure a Vertex AI batch prediction job to generate predictions for the exported data.

B.

Import the TensorFlow model by using the CREATE MODEL statement in BigQuery ML. Apply the historical data to the TensorFlow model.

C.

Export the historical data to Cloud Storage in CSV format. Configure a Vertex AI batch prediction job to generate predictions for the exported data.

D.

Configure a Vertex AI batch prediction job to apply the model to the historical data in BigQuery.

Questions 40

You need to execute a batch prediction on 100 million records in a BigQuery table with a custom TensorFlow DNN regressor model, and then store the predicted results in a BigQuery table. You want to minimize the effort required to build this inference pipeline. What should you do?

Options:

A.

Import the TensorFlow model with BigQuery ML, and run the ml.predict function.

B.

Use the TensorFlow BigQuery reader to load the data, and use the BigQuery API to write the results to BigQuery.

C.

Create a Dataflow pipeline to convert the data in BigQuery to TFRecords. Run a batch inference on Vertex AI Prediction, and write the results to BigQuery.

D.

Load the TensorFlow SavedModel in a Dataflow pipeline. Use the BigQuery I/O connector with a custom function to perform the inference within the pipeline, and write the results to BigQuery.
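A hedged sketch of the approach in option A: import the SavedModel into BigQuery ML and run ml.predict so the 100 million rows are scored without leaving BigQuery. All project, dataset, and bucket names are placeholders.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project ID

# Import the TensorFlow SavedModel stored in Cloud Storage into BigQuery ML.
client.query("""
CREATE OR REPLACE MODEL `my-project.ml.dnn_regressor`
OPTIONS (model_type = 'TENSORFLOW',
         model_path = 'gs://my-bucket/dnn_regressor/*')
""").result()

# Score the input table and write the predictions back to BigQuery.
client.query("""
CREATE OR REPLACE TABLE `my-project.ml.predictions` AS
SELECT *
FROM ML.PREDICT(MODEL `my-project.ml.dnn_regressor`,
                TABLE `my-project.ml.input_records`)
""").result()
```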

Questions 41

Your team frequently creates new ML models and runs experiments. Your team pushes code to a single repository hosted on Cloud Source Repositories. You want to create a continuous integration pipeline that automatically retrains the models whenever there is any modification of the code. What should be your first step to set up the CI pipeline?

Options:

A.

Configure a Cloud Build trigger with the event set as "Pull Request"

B.

Configure a Cloud Build trigger with the event set as "Push to a branch"

C.

Configure a Cloud Function that builds the repository each time there is a code change.

D.

Configure a Cloud Function that builds the repository each time a new branch is created.

Questions 42

Your team has a model deployed to a Vertex AI endpoint. You have created a Vertex AI pipeline that automates the model training process and is triggered by a Cloud Function. You need to prioritize keeping the model up-to-date, but also minimize retraining costs. How should you configure retraining?

Options:

A.

Configure Pub/Sub to call the Cloud Function when a sufficient amount of new data becomes available.

B.

Configure a Cloud Scheduler job that calls the Cloud Function at a predetermined frequency that fits your team's budget.

C.

Enable model monitoring on the Vertex AI endpoint. Configure Pub/Sub to call the Cloud Function when anomalies are detected.

D.

Enable model monitoring on the Vertex AI endpoint. Configure Pub/Sub to call the Cloud Function when feature drift is detected.

Questions 43

You are developing an ML pipeline using Vertex AI Pipelines. You want your pipeline to upload a new version of the XGBoost model to Vertex AI Model Registry and deploy it to Vertex AI Endpoints for online inference. You want to use the simplest approach. What should you do?

Options:

A.

Use the Vertex AI REST API within a custom component based on a vertex-ai/prediction/xgboost-cpu image.

B.

Use the Vertex AI ModelEvaluationOp component to evaluate the model.

C.

Use the Vertex AI SDK for Python within a custom component based on a python:3.10 image.

D.

Chain the Vertex AI ModelUploadOp and ModelDeployOp components together.

Questions 44

Your team is working on an NLP research project to predict political affiliation of authors based on articles they have written. You have a large training dataset that is structured like this:

You followed the standard 80%-10%-10% data distribution across the training, testing, and evaluation subsets. How should you distribute the training examples across the train-test-eval subsets while maintaining the 80-10-10 proportion?

A)

B)

C)

D)

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D

Questions 45

You work for an online travel agency that also sells advertising placements on its website to other companies. You have been asked to predict the most relevant web banner that a user should see next. Security is important to your company. The model latency requirements are 300ms@p99, the inventory is thousands of web banners, and your exploratory analysis has shown that navigation context is a good predictor. You want to implement the simplest solution. How should you configure the prediction pipeline?

Options:

A.

Embed the client on the website, and then deploy the model on AI Platform Prediction.

B.

Embed the client on the website, deploy the gateway on App Engine, and then deploy the model on AI Platform Prediction.

C.

Embed the client on the website, deploy the gateway on App Engine, deploy the database on Cloud Bigtable for writing and for reading the user's navigation context, and then deploy the model on AI Platform Prediction.

D.

Embed the client on the website, deploy the gateway on App Engine, deploy the database on Memorystore for writing and for reading the user’s navigation context, and then deploy the model on Google Kubernetes Engine.

Questions 46

You are developing a custom image classification model in Python. You plan to run your training application on Vertex AI. Your input dataset contains several hundred thousand small images. You need to determine how to store and access the images for training. You want to maximize data throughput and minimize training time while reducing the amount of additional code. What should you do?

Options:

A.

Store image files in Cloud Storage and access them directly.

B.

Store image files in Cloud Storage and access them by using serialized records.

C.

Store image files in Cloud Filestore, and access them by using serialized records.

D.

Store image files in Cloud Filestore and access them directly by using an NFS mount point.

Questions 47

You are developing an ML model in a Vertex AI Workbench notebook. You want to track artifacts and compare models during experimentation using different approaches. You need to rapidly and easily transition successful experiments to production as you iterate on your model implementation. What should you do?

Options:

A.

1. Initialize the Vertex SDK with the name of your experiment. Log parameters and metrics for each experiment, and attach dataset and model artifacts as inputs and outputs to each execution.

2. After a successful experiment, create a Vertex AI pipeline.

B.

1. Initialize the Vertex SDK with the name of your experiment. Log parameters and metrics for each experiment, save your dataset to a Cloud Storage bucket, and upload the models to Vertex AI Model Registry.

2. After a successful experiment, create a Vertex AI pipeline.

C.

1. Create a Vertex AI pipeline with parameters you want to track as arguments to your PipelineJob. Use the Metrics, Model, and Dataset artifact types from the Kubeflow Pipelines DSL as the inputs and outputs of the components in your pipeline.

2. Associate the pipeline with your experiment when you submit the job.

D.

1. Create a Vertex AI pipeline. Use the Dataset and Model artifact types from the Kubeflow Pipelines DSL as the inputs and outputs of the components in your pipeline.

2. In your training component, use the Vertex AI SDK to create an experiment run. Configure the log_params and log_metrics functions to track parameters and metrics of your experiment.
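A minimal sketch, using the Vertex AI SDK, of the experiment-tracking pattern described in options A and B: initialize an experiment, then log parameters and metrics for each run so runs can be compared in the console or pulled into a DataFrame. Names and values are placeholders.

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1",
                experiment="churn-experiments")

aiplatform.start_run("run-dnn-lr-0001")                      # one tracked run
aiplatform.log_params({"architecture": "dnn", "learning_rate": 0.001})
# ... train the model here ...
aiplatform.log_metrics({"val_accuracy": 0.91, "val_loss": 0.24})
aiplatform.end_run()

# Retrieve all runs of the experiment as a DataFrame for comparison.
runs_df = aiplatform.get_experiment_df("churn-experiments")
```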

Questions 48

You have been asked to productionize a proof-of-concept ML model built using Keras. The model was trained in a Jupyter notebook on a data scientist’s local machine. The notebook contains a cell that performs data validation and a cell that performs model analysis. You need to orchestrate the steps contained in the notebook and automate the execution of these steps for weekly retraining. You expect much more training data in the future. You want your solution to take advantage of managed services while minimizing cost. What should you do?

Options:

A.

Move the Jupyter notebook to a Notebooks instance on the largest N2 machine type, and schedule the execution of the steps in the Notebooks instance using Cloud Scheduler.

B.

Write the code as a TensorFlow Extended (TFX) pipeline orchestrated with Vertex AI Pipelines. Use standard TFX components for data validation and model analysis, and use Vertex AI Pipelines for model retraining.

C.

Rewrite the steps in the Jupyter notebook as an Apache Spark job, and schedule the execution of the job on ephemeral Dataproc clusters using Cloud Scheduler.

D.

Extract the steps contained in the Jupyter notebook as Python scripts, wrap each script in an Apache Airflow BashOperator, and run the resulting directed acyclic graph (DAG) in Cloud Composer.

Questions 49

You work for a rapidly growing social media company. Your team builds TensorFlow recommender models in an on-premises CPU cluster. The data contains billions of historical user events and 100,000 categorical features. You notice that as the data increases, the model training time increases. You plan to move the models to Google Cloud. You want to use the most scalable approach that also minimizes training time. What should you do?

Options:

A.

Deploy the training jobs by using TPU VMs with TPUv3 Pod slices, and use the TPUEmbedding API.

B.

Deploy the training jobs in an autoscaling Google Kubernetes Engine cluster with CPUs

C.

Deploy a matrix factorization model training job by using BigQuery ML.

D.

Deploy the training jobs by using Compute Engine instances with A100 GPUs, and use the tf.nn.embedding_lookup API.

Questions 50

Your data science team has requested a system that supports scheduled model retraining, Docker containers, and a service that supports autoscaling and monitoring for online prediction requests. Which platform components should you choose for this system?

Options:

A.

Vertex AI Pipelines and App Engine

B.

Vertex AI Pipelines and AI Platform Prediction

C.

Cloud Composer, BigQuery ML, and AI Platform Prediction

D.

Cloud Composer, AI Platform Training with custom containers, and App Engine

Questions 51

You work as an ML engineer at a social media company, and you are developing a visual filter for users’ profile photos. This requires you to train an ML model to detect bounding boxes around human faces. You want to use this filter in your company’s iOS-based mobile phone application. You want to minimize code development and want the model to be optimized for inference on mobile phones. What should you do?

Options:

A.

Train a model using AutoML Vision and use the “export for Core ML” option.

B.

Train a model using AutoML Vision and use the “export for Coral” option.

C.

Train a model using AutoML Vision and use the “export for TensorFlow.js” option.

D.

Train a custom TensorFlow model and convert it to TensorFlow Lite (TFLite).

Questions 52

Your organization wants to make its internal shuttle service route more efficient. The shuttles currently stop at all pick-up points across the city every 30 minutes between 7 am and 10 am. The development team has already built an application on Google Kubernetes Engine that requires users to confirm their presence and shuttle station one day in advance. What approach should you take?

Options:

A.

1. Build a tree-based regression model that predicts how many passengers will be picked up at each shuttle station.

2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the prediction.

B.

1. Build a tree-based classification model that predicts whether the shuttle should pick up passengers at each shuttle station.

2. Dispatch an available shuttle and provide the map with the required stops based on the prediction.

C.

1. Define the optimal route as the shortest route that passes by all shuttle stations with confirmed attendance at the given time under capacity constraints.

2. Dispatch an appropriately sized shuttle and indicate the required stops on the map.

D.

1. Build a reinforcement learning model with tree-based classification models that predict the presence of passengers at shuttle stops as agents, and a reward function around a distance-based metric.

2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the simulated outcome.

Questions 53

You work for a manufacturing company. You need to train a custom image classification model to detect product defects at the end of an assembly line. Although your model is performing well, some images in your holdout set are consistently mislabeled with high confidence. You want to use Vertex AI to understand your model's results. What should you do?

Options:

A.

B.

C.

D.

Questions 54

Your company manages an application that aggregates news articles from many different online sources and sends them to users. You need to build a recommendation model that will suggest articles to readers that are similar to the articles they are currently reading. Which approach should you use?

Options:

A.

Create a collaborative filtering system that recommends articles to a user based on the user’s past behavior.

B.

Encode all articles into vectors using word2vec, and build a model that returns articles based on vector similarity.

C.

Build a logistic regression model for each user that predicts whether an article should be recommended to a user.

D.

Manually label a few hundred articles, and then train an SVM classifier based on the manually classified articles that categorizes additional articles into their respective categories.

Questions 55

You are a data scientist at an industrial equipment manufacturing company. You are developing a regression model to estimate the power consumption in the company’s manufacturing plants based on sensor data collected from all of the plants. The sensors collect tens of millions of records every day. You need to schedule daily training runs for your model that use all the data collected up to the current date. You want your model to scale smoothly and require minimal development work. What should you do?

Options:

A.

Develop a custom TensorFlow regression model, and optimize it using Vertex AI Training.

B.

Develop a regression model using BigQuery ML.

C.

Develop a custom scikit-learn regression model, and optimize it using Vertex AI Training.

D.

Develop a custom PyTorch regression model, and optimize it using Vertex AI Training.

Questions 56

You are building a TensorFlow text-to-image generative model by using a dataset that contains billions of images with their respective captions. You want to create a low-maintenance, automated workflow that reads the data from a Cloud Storage bucket, collects statistics, splits the dataset into training/validation/test datasets, performs data transformations, trains the model using the training/validation datasets, and validates the model by using the test dataset. What should you do?

Options:

A.

Use the Apache Airflow SDK to create multiple operators that use Dataflow and Vertex AI services. Deploy the workflow on Cloud Composer.

B.

Use the MLflow SDK and deploy it on a Google Kubernetes Engine cluster. Create multiple components that use Dataflow and Vertex AI services.

C.

Use the Kubeflow Pipelines (KFP) SDK to create multiple components that use Dataflow and Vertex AI services. Deploy the workflow on Vertex AI Pipelines.

D.

Use the TensorFlow Extended (TFX) SDK to create multiple components that use Dataflow and Vertex AI services. Deploy the workflow on Vertex AI Pipelines.

Questions 57

Your team needs to build a model that predicts whether images contain a driver's license, passport, or credit card. The data engineering team already built the pipeline and generated a dataset composed of 10,000 images with driver's licenses, 1,000 images with passports, and 1,000 images with credit cards. You now have to train a model with the following label map: ['drivers_license', 'passport', 'credit_card']. Which loss function should you use?

Options:

A.

Categorical hinge

B.

Binary cross-entropy

C.

Categorical cross-entropy

D.

Sparse categorical cross-entropy
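A short sketch of the distinction behind options C and D: with integer labels that index the label map ['drivers_license', 'passport', 'credit_card'], sparse categorical cross-entropy applies directly, whereas categorical cross-entropy would require one-hot encoded labels. The model below is a placeholder.

```python
import tensorflow as tf

num_classes = 3  # drivers_license, passport, credit_card
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(224, 224, 3)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])

# Integer labels, e.g. 0 = drivers_license, 1 = passport, 2 = credit_card.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```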

Questions 58

You are training models in Vertex AI by using data that spans across multiple Google Cloud projects. You need to find, track, and compare the performance of the different versions of your models. Which Google Cloud services should you include in your ML workflow?

Options:

A.

Dataplex, Vertex AI Feature Store, and Vertex AI TensorBoard

B.

Vertex AI Pipelines, Vertex AI Feature Store, and Vertex AI Experiments

C.

Dataplex, Vertex AI Experiments, and Vertex AI ML Metadata

D.

Vertex AI Pipelines, Vertex AI Experiments, and Vertex AI Metadata

Questions 59

You work for a delivery company. You need to design a system that stores and manages features such as parcels delivered and truck locations over time. The system must retrieve the features with low latency and feed those features into a model for online prediction. The data science team will retrieve historical data at a specific point in time for model training. You want to store the features with minimal effort. What should you do?

Options:

A.

Store features in Bigtable as key/value data.

B.

Store features in Vertex AI Feature Store.

C.

Store features as a Vertex AI dataset, and use those features to train the models hosted in Vertex AI endpoints.

D.

Store features in BigQuery timestamp partitioned tables, and use the BigQuery Storage Read API to serve the features.

Questions 60

You have deployed a scikit-learn model to a Vertex AI endpoint using a custom model server. You enabled autoscaling; however, the deployed model fails to scale beyond one replica, which led to dropped requests. You notice that CPU utilization remains low even during periods of high load. What should you do?

Options:

A.

Attach a GPU to the prediction nodes.

B.

Increase the number of workers in your model server.

C.

Schedule scaling of the nodes to match expected demand.

D.

Increase the minReplicaCount in your DeployedModel configuration.

Questions 61

You work for a bank with strict data governance requirements. You recently implemented a custom model to detect fraudulent transactions. You want your training code to download internal data by using an API endpoint hosted in your project's network. You need the data to be accessed in the most secure way, while mitigating the risk of data exfiltration. What should you do?

Options:

A.

Enable VPC Service Controls for peerings, and add Vertex AI to a service perimeter.

B.

Create a Cloud Run endpoint as a proxy to the data. Use Identity and Access Management (IAM) authentication to secure access to the endpoint from the training job.

C.

Configure VPC Peering with Vertex AI, and specify the network of the training job.

D.

Download the data to a Cloud Storage bucket before calling the training job.

Questions 62

You work for a gaming company that manages a popular online multiplayer game where teams with 6 players play against each other in 5-minute battles. There are many new players every day. You need to build a model that automatically assigns available players to teams in real time. User research indicates that the game is more enjoyable when battles have players with similar skill levels. Which business metrics should you track to measure your model’s performance? (Choose One Correct Answer)

Options:

A.

Average time players wait before being assigned to a team

B.

Precision and recall of assigning players to teams based on their predicted versus actual ability

C.

User engagement as measured by the number of battles played daily per user

D.

Rate of return as measured by additional revenue generated minus the cost of developing a new model

Questions 63

You want to rebuild your ML pipeline for structured data on Google Cloud. You are using PySpark to conduct data transformations at scale, but your pipelines are taking over 12 hours to run. To speed up development and pipeline run time, you want to use a serverless tool and SQL syntax. You have already moved your raw data into Cloud Storage. How should you build the pipeline on Google Cloud while meeting the speed and processing requirements?

Options:

A.

Use Data Fusion's GUI to build the transformation pipelines, and then write the data into BigQuery

B.

Convert your PySpark into SparkSQL queries to transform the data and then run your pipeline on Dataproc to write the data into BigQuery.

C.

Ingest your data into Cloud SQL, convert your PySpark commands into SQL queries to transform the data, and then use federated queries from BigQuery for machine learning.

D.

Ingest your data into BigQuery using BigQuery Load, convert your PySpark commands into BigQuery SQL queries to transform the data, and then write the transformations to a new table

Questions 64

You recently trained an XGBoost model on tabular data. You plan to expose the model for internal use as an HTTP microservice. After deployment, you expect a small number of incoming requests. You want to productionize the model with the least amount of effort and latency. What should you do?

Options:

A.

Deploy the model to BigQuery ML by using CREATE MODEL with the BOOSTED_TREE_REGRESSOR statement, and invoke the BigQuery API from the microservice.

B.

Build a Flask-based app. Package the app in a custom container on Vertex AI, and deploy it to Vertex AI Endpoints.

C.

Build a Flask-based app. Package the app in a Docker image, and deploy it to Google Kubernetes Engine in Autopilot mode.

D.

Use a prebuilt XGBoost Vertex container to create a model, and deploy it to Vertex AI Endpoints.

Questions 65

You have a large corpus of written support cases that can be classified into 3 separate categories: Technical Support, Billing Support, or Other Issues. You need to quickly build, test, and deploy a service that will automatically classify future written requests into one of the categories. How should you configure the pipeline?

Options:

A.

Use the Cloud Natural Language API to obtain metadata to classify the incoming cases.

B.

Use AutoML Natural Language to build and test a classifier. Deploy the model as a REST API.

C.

Use BigQuery ML to build and test a logistic regression model to classify incoming requests. Use BigQuery ML to perform inference.

D.

Create a TensorFlow model using Google’s BERT pre-trained model. Build and test a classifier, and deploy the model using Vertex AI.

Questions 66

You are developing ML models with AI Platform for image segmentation on CT scans. You frequently update your model architectures based on the newest available research papers, and have to rerun training on the same dataset to benchmark their performance. You want to minimize computation costs and manual intervention while having version control for your code. What should you do?

Options:

A.

Use Cloud Functions to identify changes to your code in Cloud Storage and trigger a retraining job

B.

Use the gcloud command-line tool to submit training jobs on AI Platform when you update your code.

C.

Use Cloud Build linked with Cloud Source Repositories to trigger retraining when new code is pushed to the repository

D.

Create an automated workflow in Cloud Composer that runs daily and looks for changes in code in Cloud Storage using a sensor.

Buy Now
Questions 67

You have recently created a proof-of-concept (POC) deep learning model. You are satisfied with the overall architecture, but you need to determine the value for a couple of hyperparameters. You want to perform hyperparameter tuning on Vertex AI to determine both the appropriate embedding dimension for a categorical feature used by your model and the optimal learning rate. You configure the following settings:

For the embedding dimension, you set the type to INTEGER with a minValue of 16 and maxValue of 64.

For the learning rate, you set the type to DOUBLE with a minValue of 10e-05 and maxValue of 10e-02.

You are using the default Bayesian optimization tuning algorithm, and you want to maximize model accuracy. Training time is not a concern. How should you set the hyperparameter scaling for each hyperparameter and the maxParallelTrials?

Options:

A.

Use UNIT_LINEAR_SCALE for the embedding dimension, UNIT_LOG_SCALE for the learning rate, and a large number of parallel trials.

B.

Use UNIT_LINEAR_SCALE for the embedding dimension, UNIT_LOG_SCALE for the learning rate, and a small number of parallel trials.

C.

Use UNIT_LOG_SCALE for the embedding dimension, UNIT_LINEAR_SCALE for the learning rate, and a large number of parallel trials.

D.

Use UNIT_LOG_SCALE for the embedding dimension, UNIT_LINEAR_SCALE for the learning rate, and a small number of parallel trials.
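
For context, a minimal sketch of a Vertex AI hyperparameter tuning job that applies linear scaling to the integer embedding dimension and log scaling to the learning rate, with a small parallel-trial count so the Bayesian optimizer can learn from completed trials. The project, staging bucket, training image, metric name, and trial counts are assumptions.

from google.cloud import aiplatform
from google.cloud.aiplatform import hyperparameter_tuning as hpt

aiplatform.init(
    project="my-project", location="us-central1",
    staging_bucket="gs://my-bucket/staging",  # hypothetical
)

worker_pool_specs = [{
    "machine_spec": {"machine_type": "n1-standard-8"},
    "replica_count": 1,
    "container_spec": {"image_uri": "gcr.io/my-project/trainer:latest"},  # hypothetical training image
}]

custom_job = aiplatform.CustomJob(display_name="trial-job", worker_pool_specs=worker_pool_specs)

tuning_job = aiplatform.HyperparameterTuningJob(
    display_name="embed-dim-and-lr-tuning",
    custom_job=custom_job,
    metric_spec={"accuracy": "maximize"},
    parameter_spec={
        "embedding_dim": hpt.IntegerParameterSpec(min=16, max=64, scale="linear"),
        "learning_rate": hpt.DoubleParameterSpec(min=1e-05, max=1e-02, scale="log"),
    },
    max_trial_count=40,       # total trials; training time is not a concern here
    parallel_trial_count=2,   # few parallel trials so later trials benefit from earlier results
)
tuning_job.run()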

Buy Now
Questions 68

You need to use TensorFlow to train an image classification model. Your dataset is located in a Cloud Storage directory and contains millions of labeled images. Before training the model, you need to prepare the data. You want the data preprocessing and model training workflow to be as efficient, scalable, and low-maintenance as possible. What should you do?

Options:

A.

1. Create a Dataflow job that creates sharded TFRecord files in a Cloud Storage directory.

2. Reference tf.data.TFRecordDataset in the training script.

3. Train the model by using Vertex AI Training with a V100 GPU.

B.

1. Create a Dataflow job that moves the images into multiple Cloud Storage directories, where each directory is named according to the corresponding label.

2. Reference tfds.folder_dataset.ImageFolder in the training script.

3. Train the model by using Vertex AI Training with a V100 GPU.

C.

1. Create a Jupyter notebook that uses an n1-standard-64, V100 GPU Vertex AI Workbench instance.

2. Write a Python script that creates sharded TFRecord files in a directory inside the instance.

3. Reference tf.data.TFRecordDataset in the training script.

4. Train the model by using the Workbench instance.

D.

1. Create a Jupyter notebook that uses an n1-standard-64, V100 GPU Vertex AI Workbench instance.

2. Write a Python script that copies the images into multiple Cloud Storage directories, where each directory is named according to the corresponding label.

3. Reference tfds.folder_dataset.ImageFolder in the training script.

4. Train the model by using the Workbench instance.
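
To illustrate the tf.data.TFRecordDataset reference in options A and C, a minimal sketch of an input pipeline that reads sharded TFRecord files from Cloud Storage; the file pattern and the feature schema are assumptions.

import tensorflow as tf

# List the sharded TFRecord files produced by the preprocessing job (hypothetical path).
files = tf.data.Dataset.list_files("gs://my-bucket/tfrecords/train-*.tfrecord")

feature_spec = {
    "image": tf.io.FixedLenFeature([], tf.string),  # JPEG-encoded bytes (assumed schema)
    "label": tf.io.FixedLenFeature([], tf.int64),
}

def parse_example(serialized):
    example = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_jpeg(example["image"], channels=3)
    image = tf.image.resize(image, [224, 224]) / 255.0
    return image, example["label"]

dataset = (
    tf.data.TFRecordDataset(files, num_parallel_reads=tf.data.AUTOTUNE)
    .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    .shuffle(10_000)
    .batch(64)
    .prefetch(tf.data.AUTOTUNE)
)
# dataset can now be passed to model.fit() in the Vertex AI Training script.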

Buy Now
Questions 69

You are building a real-time prediction engine that streams files which may contain Personally Identifiable Information (PII) to Google Cloud. You want to use the Cloud Data Loss Prevention (DLP) API to scan the files. How should you ensure that the PII is not accessible by unauthorized individuals?

Options:

A.

Stream all files to Google Cloud, and then write the data to BigQuery. Periodically conduct a bulk scan of the table using the DLP API.

B.

Stream all files to Google Cloud, and write batches of the data to BigQuery. While the data is being written to BigQuery, conduct a bulk scan of the data using the DLP API.

C.

Create two buckets of data: Sensitive and Non-sensitive. Write all data to the Non-sensitive bucket. Periodically conduct a bulk scan of that bucket using the DLP API, and move the sensitive data to the Sensitive bucket.

D.

Create three buckets of data: Quarantine, Sensitive, and Non-sensitive. Write all data to the Quarantine bucket. Periodically conduct a bulk scan of that bucket using the DLP API, and move the data to either the Sensitive or Non-sensitive bucket.
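
For illustration, a minimal sketch of the quarantine-bucket pattern: scan an object with the DLP API and move it to a Sensitive or Non-sensitive bucket based on the findings. The project ID, bucket names, and infoTypes are assumptions, and a production job would scan in bulk rather than per object.

from google.cloud import dlp_v2, storage

PROJECT = "my-project"  # hypothetical
dlp = dlp_v2.DlpServiceClient()
gcs = storage.Client(project=PROJECT)

def route_object(object_name: str) -> None:
    quarantine = gcs.bucket("quarantine-bucket")  # all new files land here first
    blob = quarantine.blob(object_name)
    text = blob.download_as_text()

    # Inspect the file contents for PII.
    response = dlp.inspect_content(
        request={
            "parent": f"projects/{PROJECT}",
            "inspect_config": {
                "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "CREDIT_CARD_NUMBER"}],
            },
            "item": {"value": text},
        }
    )

    # Move the object to the appropriate bucket based on whether PII was found.
    target = "sensitive-bucket" if response.result.findings else "nonsensitive-bucket"
    quarantine.copy_blob(blob, gcs.bucket(target))
    blob.delete()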

Buy Now
Questions 70

You are developing an ML model that predicts the cost of used automobiles based on data such as location, condition, model type, color, and engine/battery efficiency. The data is updated every night. Car dealerships will use the model to determine appropriate car prices. You created a Vertex AI pipeline that reads the data, splits the data into training/evaluation/test sets, performs feature engineering, trains the model by using the training dataset, and validates the model by using the evaluation dataset. You need to configure a retraining workflow that minimizes cost. What should you do?

Options:

A.

Compare the training and evaluation losses of the current run. If the losses are similar, deploy the model to a Vertex AI endpoint. Configure a cron job to redeploy the pipeline every night.

B.

Compare the training and evaluation losses of the current run. If the losses are similar, deploy the model to a Vertex AI endpoint with training/serving skew threshold model monitoring. When the model monitoring threshold is triggered, redeploy the pipeline.

C.

Compare the results to the evaluation results from a previous run. If the performance improved, deploy the model to a Vertex AI endpoint. Configure a cron job to redeploy the pipeline every night.

D.

Compare the results to the evaluation results from a previous run. If the performance improved, deploy the model to a Vertex AI endpoint with training/serving skew threshold model monitoring. When the model monitoring threshold is triggered, redeploy the pipeline.

Buy Now
Questions 71

You developed a Transformer model in TensorFlow to translate text. Your training data includes millions of documents in a Cloud Storage bucket. You plan to use distributed training to reduce training time. You need to configure the training job while minimizing the effort required to modify code and to manage the cluster's configuration. What should you do?

Options:

A.

Create a Vertex AI custom training job with GPU accelerators for the second worker pool. Use tf.distribute.MultiWorkerMirroredStrategy for distribution.

B.

Create a Vertex AI custom distributed training job with Reduction Server. Use N1 high-memory machine type instances for the first and second pools, and use N1 high-CPU machine type instances for the third worker pool.

C.

Create a training job that uses Cloud TPU VMs. Use tf.distribute.TPUStrategy for distribution.

D.

Create a Vertex AI custom training job with a single worker pool of A2 GPU machine type instances. Use tf.distribute.MirroredStrategy for distribution.
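
To show how little code a distribution strategy typically requires, a minimal sketch that wraps model construction in tf.distribute.MultiWorkerMirroredStrategy (the strategy referenced in option A); on a Vertex AI custom training job with multiple worker pools, the TF_CONFIG environment variable is populated for you. The model-building function and dataset are hypothetical placeholders.

import tensorflow as tf

def build_translation_model():
    # Placeholder architecture; the real Transformer would be built here.
    return tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=32_000, output_dim=256),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(32_000, activation="softmax"),
    ])

# Each worker picks up its role from TF_CONFIG, which Vertex AI sets automatically.
strategy = tf.distribute.MultiWorkerMirroredStrategy()

with strategy.scope():
    model = build_translation_model()
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# train_dataset would be a tf.data pipeline reading the documents from Cloud Storage (not shown):
# model.fit(train_dataset, epochs=10)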

Buy Now
Questions 72

You work for a company that is developing an application to help users with meal planning. You want to use machine learning to scan a corpus of recipes and extract each ingredient (e.g., carrot, rice, pasta) and each kitchen cookware item (e.g., bowl, pot, spoon) mentioned. Each recipe is saved in an unstructured text file. What should you do?

Options:

A.

Create a text dataset on Vertex AI for entity extraction. Create two entities called "ingredient" and "cookware", and label at least 200 examples of each entity. Train an AutoML entity extraction model to extract occurrences of these entity types. Evaluate performance on a holdout dataset.

B.

Create a multi-label text classification dataset on Vertex AI. Create a test dataset, and label each recipe with its ingredients and cookware. Train a multi-class classification model. Evaluate the model's performance on a holdout dataset.

C.

Use the Entity Analysis method of the Natural Language API to extract the ingredients and cookware from each recipe. Evaluate the model's performance on a prelabeled dataset.

D.

Create a text dataset on Vertex AI for entity extraction. Create as many entities as there are different ingredients and cookware. Train an AutoML entity extraction model to extract those entities. Evaluate the model's performance on a holdout dataset.

Buy Now
Questions 73

You have built a model that is trained on data stored in Parquet files. You access the data through a Hive table hosted on Google Cloud. You preprocessed this data with PySpark and exported it as a CSV file to Cloud Storage. After preprocessing, you execute additional steps to train and evaluate your model. You want to parametrize this model training in Kubeflow Pipelines. What should you do?

Options:

A.

Remove the data transformation step from your pipeline.

B.

Containerize the PySpark transformation step, and add it to your pipeline.

C.

Add a ContainerOp to your pipeline that spins a Dataproc cluster, runs a transformation, and then saves the transformed data in Cloud Storage.

D.

Deploy Apache Spark at a separate node pool in a Google Kubernetes Engine cluster. Add a ContainerOp to your pipeline that invokes a corresponding transformation job for this Spark instance.
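
As a sketch of option C's approach, a Kubeflow Pipelines (v1 SDK) ContainerOp that creates a short-lived Dataproc cluster, submits the PySpark transformation, and tears the cluster down; the container image, region, bucket, and script path are assumptions.

from kfp import dsl

def dataproc_transform_op() -> dsl.ContainerOp:
    # One pipeline step that runs the PySpark preprocessing on an ephemeral Dataproc cluster.
    return dsl.ContainerOp(
        name="pyspark-transform",
        image="google/cloud-sdk:slim",  # has gcloud preinstalled
        command=["bash", "-c"],
        arguments=[
            "gcloud dataproc clusters create transform-cluster "
            "--region=us-central1 --single-node && "
            "gcloud dataproc jobs submit pyspark gs://my-bucket/transform.py "
            "--cluster=transform-cluster --region=us-central1 && "
            "gcloud dataproc clusters delete transform-cluster "
            "--region=us-central1 --quiet"
        ],
    )

@dsl.pipeline(name="training-pipeline")
def pipeline():
    transform = dataproc_transform_op()
    # Downstream training and evaluation steps would read the transformed CSV from Cloud Storage
    # and declare this step as an upstream dependency, e.g. train_op().after(transform).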

Buy Now
Questions 74

You are training an ML model on a large dataset. You are using a TPU to accelerate the training process. You notice that the training process is taking longer than expected. You discover that the TPU is not reaching its full capacity. What should you do?

Options:

A.

Increase the learning rate

B.

Increase the number of epochs

C.

Decrease the learning rate

D.

Increase the batch size

Buy Now
Questions 75

Your company manages an ecommerce website. You developed an ML model that recommends additional products to users in near real time based on items currently in the user's cart. The workflow will include the following processes:

1. The website will send a Pub/Sub message with the relevant data and then receive a message with the prediction from Pub/Sub.

2. Predictions will be stored in BigQuery.

3. The model will be stored in a Cloud Storage bucket and will be updated frequently.

You want to minimize prediction latency and the effort required to update the model. How should you reconfigure the architecture?

Options:

A.

Write a Cloud Function that loads the model into memory for prediction. Configure the function to be triggered when messages are sent to Pub/Sub.

B.

Create a pipeline in Vertex AI Pipelines that performs preprocessing, prediction, and postprocessing. Configure the pipeline to be triggered by a Cloud Function when messages are sent to Pub/Sub.

C.

Expose the model as a Vertex AI endpoint. Write a custom DoFn in a Dataflow job that calls the endpoint for prediction.

D.

Use the RunInference API with watchFilePattern in a Dataflow job that wraps around the model and serves predictions.
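
For illustration, a minimal sketch of option D using Apache Beam's RunInference transform in a streaming Dataflow job, with Beam's WatchFilePattern utility wired in as a side input so a frequently updated model in Cloud Storage is reloaded automatically. The scikit-learn model handler, message format, paths, and topic names are assumptions, and the exact auto-refresh wiring follows the pattern described in the Beam documentation rather than this question.

import json

import apache_beam as beam
import numpy as np
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy
from apache_beam.ml.inference.utils import WatchFilePattern
from apache_beam.options.pipeline_options import PipelineOptions

model_handler = SklearnModelHandlerNumpy(model_uri="gs://my-bucket/models/model.pkl")

def to_features(message: bytes) -> np.ndarray:
    # Hypothetical: the Pub/Sub message carries a JSON list of cart features.
    return np.array(json.loads(message)["features"])

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    # Emits updated model metadata whenever a new model file matching the pattern appears.
    model_updates = p | "WatchModel" >> WatchFilePattern(file_pattern="gs://my-bucket/models/*.pkl")

    (
        p
        | "ReadCarts" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/cart-events")
        | "ToFeatures" >> beam.Map(to_features)
        | "Predict" >> RunInference(model_handler, model_metadata_pcoll=model_updates)
        | "ToBytes" >> beam.Map(lambda result: json.dumps(
            {"prediction": np.asarray(result.inference).tolist()}).encode("utf-8"))
        | "PublishPredictions" >> beam.io.WriteToPubSub(
            topic="projects/my-project/topics/cart-predictions")
    )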

Buy Now
Questions 76

You are using Kubeflow Pipelines to develop an end-to-end PyTorch-based MLOps pipeline. The pipeline reads data from BigQuery, processes the data, conducts feature engineering, model training, and model evaluation, and deploys the model as a binary file to Cloud Storage. You are writing code for several different versions of the feature engineering and model training steps, and running each new version in Vertex AI Pipelines. Each pipeline run is taking over an hour to complete. You want to speed up the pipeline execution to reduce your development time, and you want to avoid additional costs. What should you do?

Options:

A.

Delegate feature engineering to BigQuery and remove it from the pipeline.

B.

Add a GPU to the model training step.

C.

Enable caching in all the steps of the Kubeflow pipeline.

D.

Comment out the part of the pipeline that you are not currently updating.
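
For reference, a minimal sketch of running the compiled Kubeflow pipeline on Vertex AI Pipelines with execution caching enabled, so steps whose code and inputs have not changed reuse outputs from previous runs instead of re-executing. The project, region, pipeline root, and spec path are assumptions.

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # hypothetical project/region

job = aiplatform.PipelineJob(
    display_name="pytorch-mlops-pipeline",
    template_path="pipeline.json",                    # compiled Kubeflow pipeline spec (hypothetical path)
    pipeline_root="gs://my-bucket/pipeline-root",     # hypothetical
    enable_caching=True,                              # unchanged steps are skipped; cached outputs are reused
)
job.run()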

Buy Now
Questions 77

You are experimenting with a built-in distributed XGBoost model in Vertex AI Workbench user-managed notebooks. You use BigQuery to split your data into training and validation sets using the following queries:

CREATE OR REPLACE TABLE `myproject.mydataset.training` AS

(SELECT * FROM `myproject.mydataset.mytable` WHERE RAND() <= 0.8);

CREATE OR REPLACE TABLE `myproject.mydataset.validation` AS

(SELECT * FROM `myproject.mydataset.mytable` WHERE RAND() <= 0.2);

After training the model, you achieve an area under the receiver operating characteristic curve (AUC ROC) value of 0.8, but after deploying the model to production, you notice that your model performance has dropped to an AUC ROC value of 0.65. What problem is most likely occurring?

Options:

A.

There is training-serving skew in your production environment.

B.

There is not a sufficient amount of training data.

C.

The tables that you created to hold your training and validation records share some records, and you may not be using all the data in your initial table.

D.

The RAND() function generated a number that is less than 0.2 in both instances, so every record in the validation table will also be in the training table.
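
For comparison, a hedged sketch of a deterministic, non-overlapping 80/20 split that uses a hash function instead of RAND(), submitted through the BigQuery Python client; the key column (id) is a hypothetical placeholder for a unique row identifier.

from google.cloud import bigquery

client = bigquery.Client()

# FARM_FINGERPRINT gives each row a stable hash, so the two tables cannot overlap
# and together cover the full source table ("id" is an assumed unique key column).
split_sql = """
CREATE OR REPLACE TABLE `myproject.mydataset.training` AS
SELECT * FROM `myproject.mydataset.mytable`
WHERE MOD(ABS(FARM_FINGERPRINT(CAST(id AS STRING))), 10) < 8;

CREATE OR REPLACE TABLE `myproject.mydataset.validation` AS
SELECT * FROM `myproject.mydataset.mytable`
WHERE MOD(ABS(FARM_FINGERPRINT(CAST(id AS STRING))), 10) >= 8;
"""
client.query(split_sql).result()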

Buy Now
Questions 78

You have developed an application that uses a chain of multiple scikit-learn models to predict the optimal price for your company's products. The workflow logic is shown in the diagram. Members of your team use the individual models in other solution workflows. You want to deploy this workflow while ensuring version control for each individual model and the overall workflow. Your application needs to be able to scale down to zero. You want to minimize the compute resource utilization and the manual effort required to manage this solution. What should you do?

Options:

A.

Expose each individual model as an endpoint in Vertex AI Endpoints. Create a custom container endpoint to orchestrate the workflow.

B.

Create a custom container endpoint for the workflow that loads each model's individual files. Track the versions of each individual model in BigQuery.

C.

Expose each individual model as an endpoint in Vertex AI Endpoints. Use Cloud Run to orchestrate the workflow.

D.

Load each model's individual files into Cloud Run. Use Cloud Run to orchestrate the workflow. Track the versions of each individual model in BigQuery.

Buy Now
Questions 79

You work for a public transportation company and need to build a model to estimate delay times for multiple transportation routes. Predictions are served directly to users in an app in real time. Because different seasons and population increases impact the data relevance, you will retrain the model every month. You want to follow Google-recommended best practices. How should you configure the end-to-end architecture of the predictive model?

Options:

A.

Configure Kubeflow Pipelines to schedule your multi-step workflow from training to deploying your model.

B.

Use a model trained and deployed on BigQuery ML, and trigger retraining with the scheduled query feature in BigQuery.

C.

Write a Cloud Functions script that launches a training and deploying job on AI Platform and is triggered by Cloud Scheduler.

D.

Use Cloud Composer to programmatically schedule a Dataflow job that executes the workflow from training to deploying your model

Buy Now
Questions 80

You are creating a model training pipeline to predict sentiment scores from text-based product reviews. You want to have control over how the model parameters are tuned, and you will deploy the model to an endpoint after it has been trained. You will use Vertex AI Pipelines to run the pipeline. You need to decide which Google Cloud pipeline components to use. What components should you choose?

Options:

A.

B.

C.

D.

Buy Now
Exam Code: Professional-Machine-Learning-Engineer
Exam Name: Google Professional Machine Learning Engineer
Last Update: Apr 26, 2024
Questions: 268