
Welcome To DumpsPedia

DA0-001 Sample Questions Answers

Questions 4

Which of the following best describes a 95% confidence interval?

Options:

A.

There is a 95% probability that a sample is within one standard deviation of the mean.

B.

A stated range may contain 95% of the population mean, 95% of the time.

C.

A set of ranges contains the population mean with 95% certainty.

D.

A range contains 95% of the population mean.

Questions 5

You would like to measure how well an organization is achieving its goals.

What type of analysis should you perform?

Options:

A.

Performance analysis.

B.

Outlier analysis.

C.

Predictive analysis.

D.

Trend analysis.

Questions 6

Which one of the following is a common data warehouse schema?

Options:

A.

Snowflake.

B.

Square.

C.

Spiral.

D.

Sphere.

Questions 7

A business intelligence engineer needs to reduce the size of a data model for reporting purposes. The data set contains more than one million rows, and the table has a date-time column named Date. Which of the following should the analyst do to complete this task?

Options:

A.

Change the data type of the Date column to text.

B.

Trim the date.

C.

Round the hour of the Date column to the start of the hour.

D.

Split the Date column into two columns—time and date.
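For reference, a minimal sketch of what rounding a date-time value down to the start of its hour looks like, and why it can shrink the number of distinct values a column holds. The timestamps and the helper name `floor_to_hour` are hypothetical.

```python
from datetime import datetime

def floor_to_hour(ts: datetime) -> datetime:
    """Drop minutes, seconds, and microseconds so the value sits at the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)

# Hypothetical values for the Date column.
stamps = [
    datetime(2023, 1, 5, 14, 37, 12),
    datetime(2023, 1, 5, 14, 2, 59),
    datetime(2023, 1, 5, 15, 0, 1),
]
rounded = [floor_to_hour(ts) for ts in stamps]

# Fewer distinct values generally compress better in a columnar data model.
distinct_before = len(set(stamps))
distinct_after = len(set(rounded))
print(distinct_before, distinct_after)
```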

Questions 8

An analyst needs to know what data an organization possesses. Which of the following is the best document for the analyst to consult?

Options:

A.

Data destruction policy

B.

Data use document

C.

Data dictionary

D.

Data retention policy

Questions 9

A cereal manufacturer wants to determine whether the sugar content of its cereal has increased over the years. Which of the following is the appropriate descriptive statistic to use?

Options:

A.

Frequency

B.

Percent change

C.

Variance

D.

Mean
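For reference, a minimal sketch of how a percent change between two measurements is computed. The sugar values are hypothetical and the function name is illustrative.

```python
def percent_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    if old == 0:
        raise ValueError("percent change is undefined when the starting value is 0")
    return (new - old) / old * 100.0

# Hypothetical sugar content (grams per serving) measured in two different years.
sugar_then = 10.0
sugar_now = 12.0
change = percent_change(sugar_then, sugar_now)
print(f"{change:.1f}%")
```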

Questions 10

Consider two different datasets, one with gas prices and the other with food prices. Which of the following measures is most affected by outliers?

Options:

A.

Absolute value

B.

Mode

C.

Median

D.

Mean
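For reference, a small sketch comparing how the mean, median, and mode respond when one extreme value is added to a data set. The prices are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical prices; the second list adds a single extreme value.
prices = [3, 3, 4, 5, 5, 5, 6]
with_outlier = prices + [60]

for label, data in (("original", prices), ("with outlier", with_outlier)):
    print(label, round(mean(data), 2), median(data), mode(data))
```

In this sample the median and mode stay at 5 in both lists, while the mean moves substantially.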

Questions 11

An analyst is updating a customer contacts database with information obtained from a survey of new customers. Which of the following data manipulation techniques should the analyst use?

Options:

A.

Join

B.

Append

C.

Transform

D.

Blend

Questions 12

A table in a hospital database has a column for patient height in inches and a column for patient height in centimeters. This is an example of:

Options:

A.

dependent data.

B.

duplicate data.

C.

invalid data

D.

redundant data

Questions 13

A client has requested an analysis of all pet care items purchased by current customers and their social media connections in the past 12 months. Which of the following data analysis techniques would be the best choice given these requirements?

Options:

A.

Trend analysis

B.

Performance analysis

C.

Link analysis

D.

Exploratory data analysis

Questions 14

Which of the following is the best reason for removing data outliers?

Options:

A.

Data varies significantly from others.

B.

Data is redundant in the table.

C.

Data is duplicated in the whole range.

D.

Data is missing from the table.

Questions 15

After a merger, an analyst needs to enhance a very complicated quarterly report so that it is more user friendly for new team members. Which of the following elements would help reduce questions?

Options:

A.

Version details

B.

Appendix

C.

Reference data sources

D.

FAQs

Questions 16

A county in Illinois is conducting a survey to determine the mean annual income per household. The county is 427 sq mi (2.65 sq km). Which of the following sampling methods would MOST likely result in a representative sample?

Options:

A.

A stratified phone survey of 100 people that is conducted between 2:00 p.m. and 3:00 p.m.

B.

A systematic survey that is sent to 100 single-family homes in the county

C.

Surveys sent to ten randomly selected homes within 5mi (8km) of the county’s office

D.

Surveys sent to 100 randomly selected homes that are reflective of the population

Questions 17

Which of the following describes the use of a representative amount of data from a main repository?

Options:

A.

Observation

B.

Delta load

C.

Web scraping

D.

Sampling

Questions 18

A column is being used to store strings of variable lengths. Performance is a concern, so the column needs to use as little space as possible. Which of the following data types best meets these requirements?

Options:

A.

char

B.

nchar

C.

varchar

D.

nvarchar

Questions 19

Which of the following is a non-parametric test?

Options:

A.

One-sample t-test

B.

Two-way ANOVA

C.

Correlation coefficient

D.

Spearman's rank correlation

Questions 20

Which of the following is the most likely reason for a data analyst to optimize a query using parameterization?

Options:

A.

To return a subset of records

B.

To insert a temporary table

C.

To prevent SQL injections

D.

To increase the query speed
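For reference, a minimal sketch of what a parameterized query looks like, using Python's built-in `sqlite3` module. The `customers` table, its rows, and the helper `customers_in` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ana", "Chicago"), ("Ben", "Boston")],
)

def customers_in(city):
    """Return customer names for one city using a bound parameter."""
    # The ? placeholder sends city as a value; it is never spliced into the SQL text.
    rows = conn.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY name", (city,)
    ).fetchall()
    return [name for (name,) in rows]

print(customers_in("Chicago"))
# A crafted input is compared as a plain string instead of being run as SQL.
print(customers_in("x' OR '1'='1"))
```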

Questions 21

Given the customer table below:

Which of the following chart types is the most appropriate to represent the average spending of active customers vs. inactive customers?

Options:

A.

Pie chart

B.

Heat graph

C.

Scatter plot

D.

Line chart

Questions 22

Which of the following describes the method of sampling in which elements of data are selected randomly from each of the small subgroups within a population?

Options:

A.

Simple random

B.

Cluster

C.

Systematic

D.

Stratified
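For reference, a minimal sketch of drawing a random sample from each subgroup of a population. The household data, the subgroup labels, and the function name `stratified_sample` are hypothetical, and equal-sized draws per subgroup are only one possible allocation.

```python
import random

def stratified_sample(population, key, per_stratum, seed=0):
    """Group items by key(item), then draw a random sample from every group."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    strata = {}
    for item in population:
        strata.setdefault(key(item), []).append(item)
    sample = []
    for members in strata.values():
        k = min(per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 50 urban and 20 rural households.
households = [("urban", i) for i in range(50)] + [("rural", i) for i in range(20)]
picked = stratified_sample(households, key=lambda h: h[0], per_stratum=5)
print(len(picked))
```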

Questions 23

Which of the following best describe qualitative data? (Select two).

Options:

A.

Discrete

B.

Ordinal

C.

Batch

D.

Continuous

E.

Nominal

F.

Real-time

Questions 24

An analyst wants to extract data from a variety of sources and store the data in a cloud-based environment prior to cleaning. Which of the following integration techniques should the analyst use?

Options:

A.

ETL

B.

API

C.

SQL

D.

ELT

Questions 25

A data analyst has been asked to merge the tables below, first performing an INNER JOIN and then a LEFT JOIN:

Customer Table -

In-store Transactions –

Which of the following describes the number of rows of data that can be expected after performing both joins in the order stated, considering the customer table as the main table?

Options:

A.

INNER: 6 rows; LEFT: 9 rows

B.

INNER: 9 rows; LEFT: 6 rows

C.

INNER: 9 rows; LEFT: 15 rows

D.

INNER: 15 rows; LEFT: 9 rows

Questions 26

Which of the following can be used to translate data into another form so it can only be read by a user who has a key or a password?

Options:

A.

Data encryption.

B.

Data transmission.

C.

Data protection.

D.

Data masking.

Questions 27

Which of the following techniques is used to quantify data?

Options:

A.

Decoding

B.

Enumeration

C.

Coding

D.

Structure

Questions 28

The process of performing initial investigations on data to spot outliers, discover patterns, and test assumptions with statistical insight and graphical visualization is called:

Options:

A.

a t-test.

B.

a performance analysis.

C.

an exploratory data analysis.

D.

a link analysis.

Questions 29

An analyst compiled a high-level report that includes the following data points:

    Total dollars closed for the year

    Annual quota/goal

    Top 10 customers

    Average deal size

    Largest deals lost

Which of the following groups is the most likely audience for this report?

Options:

A.

External vendors

B.

General public

C.

Lower-level managers

D.

C-suite officers

Questions 30

An employer needs to maintain adequate office staffing during the winter and wants to track storm data. Which of the following data collection methods should the employer use?

Options:

A.

Web scraping

B.

Public databases

C.

Observations

D.

Weather surveys

Questions 31

Angela is aggregating data from a CRM system with data from an employee system.

While performing an initial quality check, she realizes that her employee ID is not associated with her identifier in the CRM system.

What kind of issue is Angela facing?

Choose the best answer.

Options:

A.

ETL process.

B.

Record linkage.

C.

ELT process.

D.

System integration.

Questions 32

A recurring event is being stored in two databases that are housed in different geographical locations. A data analyst notices the event is being logged three hours earlier in one database than in the other database. Which of the following is the MOST likely cause of the issue?

Options:

A.

The data analyst is not querying the databases correctly.

B.

The databases are recording different events.

C.

The databases are recording the event in different time zones.

D.

The second database is logging incorrectly.

Questions 33

A reporting analyst needs to create a report that refreshes automatically and is accessible to the entire sales organization. Which of the following tools is the most appropriate to use for this task?

Options:

A.

R

B.

Excel

C.

Tableau

D.

Python

Questions 34

An analyst wants to create a historical data set for the past five years with each year in its own data set. Which of the following methods is the best way to create this historical data set?

Options:

A.

Data transpose

B.

Data concatenation

C.

Data append

D.

Data normalization

Questions 35

Joe, an analyst, tests the loading time on a dashboard he is preparing to go live and finds it is slower than he would like. Which of the following must occur to decrease the loading time?

Options:

A.

Deploy the dashboard to production.

B.

Change the field definitions.

C.

Update the dashboard subscribers.

D.

Optimize the dashboard.

Questions 36

Which of the following is the best approach to use to gain a general understanding of a data set?

Options:

A.

Descriptive statistics

B.

Basic projections

C.

Gap analysis

D.

Trend analysis

Questions 37

Each month an analyst needs to execute a data pull for the two prior months. Which of the following is the most efficient function for the analyst to use?

Options:

A.

Logical

B.

Date

C.

Aggregate

D.

System

Questions 38

An organization would like to add a secondary email field to its customer database in order to enrich the customer profiles. Which of the following data manipulation techniques should the analyst use to add this information?

Options:

A.

Blend

B.

Merge

C.

Append

D.

Aggregate

Questions 39

Given the following table:

Date of visit    Age    Gender
6/1/22           30     Male
6/15/22          65F    Fem.
6/19/2022        24     M

Which of the following describes the data quality issues with the age data?

Options:

A.

Completeness

B.

Consistency

C.

Accuracy

D.

Manipulation

Questions 40

Which of the following report types is most appropriate for a high-level, year-end report requested by a Chief Executive Officer?

Options:

A.

Dynamic

B.

Recurring

C.

Ad hoc

D.

Self-service

Questions 41

Given the following table of student scores (with some values that violate the allowed scoring rules), which of the following is the best reason for cleansing the data?

Options:

A.

Invalid data

B.

Redundant data

C.

Data outliers

D.

Missing data

Questions 42

You are working with a dataset and want to change the names of categories that you used for different types of books.

What term best describes this action?

Options:

A.

Recoding.

B.

Summarizing

C.

Aggregating.

D.

Filtering.

Questions 43

A business unit made the following modification to the values in a table:

Which of the following data quality dimensions was applied in this scenario?

Options:

A.

Integrity

B.

Consistency

C.

Completeness

D.

Accuracy

Questions 44

An analyst is reviewing the following data:

Car ID    Speed
1231      55
5664      36
5644      18
6505      67
5464      36
6456      38

Which of the following should the analyst include in the measures of central tendency for speed?

Options:

A.

Mode = 38 Range = 31 Mean = 42.5

B.

Range = 49 Max = 67 Min = 18

C.

Mode = 36 Max = 67 Min = 18

D.

Mode = 36 Median = 37 Mean = 41.5
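For reference, a minimal sketch of computing the three measures of central tendency with Python's `statistics` module. It assumes each data row above reads as a four-digit car ID followed by a two-digit speed, which is an interpretation of the run-together values.

```python
from statistics import mean, median, mode

# Speeds read from the table, assuming a four-digit ID followed by a two-digit speed.
speeds = [55, 36, 18, 67, 36, 38]

speed_mode = mode(speeds)      # most frequent value
speed_median = median(speeds)  # middle of the sorted values
speed_mean = mean(speeds)      # arithmetic average
print(speed_mode, speed_median, round(speed_mean, 2))
```

Range, maximum, and minimum describe spread and extremes, so they are left out of this sketch.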

Questions 45

A user receives a large custom report to track company sales across various date ranges. The user then completes a series of manual calculations for each date range. Which of the following should an analyst suggest so the user has a dynamic, seamless experience?

Options:

A.

Create multiple reports, one for each needed date range.

B.

Build calculations into the report so they are done automatically.

C.

Add macros to the report to speed up the filtering and calculations process.

D.

Create a dashboard with a date range picker and calculations built in.

Questions 46

A data analyst must fulfill a request for information that is needed weekly and should be automatically emailed to a specific set of users. Which of the following types of reports should the analyst recommend?

Options:

A.

A self-service report

B.

A research report

C.

An ad hoc report

D.

An operational report

Questions 47

A survey asks participants to rate a company on a scale of one to ten. Which of the following best describes the rating variable?

Options:

A.

Continuous

B.

Ordinal

C.

Categorical

D.

Nominal

Questions 48

A data analyst has removed the outliers from a data set due to large variances. Which of the following central tendencies would be the best measure to use?

Options:

A.

Range

B.

Mean

C.

Mode

D.

Median

Questions 49

A data analyst is setting up a data dashboard to monitor several ETL data streams to ensure that data is complete for later analysis. Which of the following audiences should the analyst target for this dashboard?

Options:

A.

Executives

B.

The management team

C.

Technical experts

D.

External vendors

Questions 50

A data analyst is working with a team to create a dashboard for a client who requires on-demand access. Which of the following is the best delivery method to support the client's requirement?

Options:

A.

Email

B.

Scheduled

C.

Subscription

D.

Static

Questions 51

Which of the following summary statements upholds integrity in data reporting?

Options:

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D, over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Questions 52

Which of the following are reasons to conduct data cleansing? (Select two).

Options:

A.

To perform web scraping

B.

To track KPIs

C.

To improve accuracy

D.

To review data sets

E.

To increase the sample size

F.

To calculate trends

Questions 53

Given the following graph:

Which of the following summary statements upholds integrity in data reporting?

Options:

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D, over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Questions 54

Which of the following best describes the use of a tab sequence?

Options:

A.

\t

B.

\\t

C.

\l

D.

\\l
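For reference, a short sketch of how a backslash followed by `t` behaves in a Python string literal, compared with a doubled backslash. Other languages and tools may treat these sequences differently.

```python
# "\t" is an escape sequence: one tab character.
tabbed = "Name\tAge"
# "\\t" is an escaped backslash followed by the letter t: two characters.
literal = "Name\\tAge"

print(tabbed.split("\t"))   # the tab splits the text into two fields
print(len(tabbed), len(literal))
```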

Questions 55

Given the image below:

The data should be cleaned because of the presence of:

Options:

A.

outlier

B.

non-parametric data.

C.

multicollinearity.

D.

invalid data.

Questions 56

Which of the following best describes an exploratory analysis?

Options:

A.

Involves the use of descriptive statistics to understand observations

B.

Involves analysis of exploring data sets for performance tracking

C.

Involves the testing of specific hypotheses

D.

Involves the use of arithmetic algebra to determine the distribution

Questions 57

Given the following:

Which of the following is the most important thing for an analyst to do when transforming the table for a trend analysis?

Options:

A.

Fill in the missing cost where it is null.

B.

Separate the table into two tables and create a primary key

C.

Replace the extended cost field with a calculated field.

D.

Correct the dates so they have the same format.

Questions 58

Taylor wants to investigate how manufacturing, marketing, and sales expenditures impact overall profitability for her company.

Which of the following systems is the most appropriate?

Options:

A.

OLTP.

B.

OLAP.

C.

Data warehouse.

D.

Data mart.

Questions 59

An analyst wants to determine whether a relationship between an individual's age and voting preferences exists. Which of the following is the best statistical method for the analyst to use?

Options:

A.

P-value

B.

Chi-squared

C.

F-test

D.

Z-score

Questions 60

A sales manager requested a report that contains the first name, last name, and phone number of all of the company's customers and employees. The data engineer needs to return all the records from several tables, even duplicates. Which of the following is the best way to join the two tables?

Options:

A.

FULL OUTER JOIN

B.

FULL INNER JOIN

C.

LEFT OUTER JOIN

D.

CROSS JOIN

Questions 61

An analyst in a consumer bank department wants to showcase the concentration of accounts opened in the United States by ZIP Code to describe the effectiveness of the bank's marketing campaigns. Which of the following would be the best way to visualize the data?

Options:

A.

A stacked chart

B.

A tree map

C.

A waterfall chart

D.

A geographic map

Questions 62

A database administrator is required to mask certain table columns containing PII in order to comply with the company privacy policy. Which of the following are the most likely types of information the administrator should mask? (Select two).

Options:

A.

Government-issued ID

B.

Address

C.

Order ID

D.

Order date

E.

Customer ID

F.

Referral number

Questions 63

Which of the following data types would a telephone number formatted as XXX-XXX-XXXX be considered?

Options:

A.

Numeric

B.

Date

C.

Float

D.

Text

Questions 64

A data analyst has received a data set that contains actual and projected sales for the fourth quarter of 2019. Which of the following statistical methods should the analyst use to find the measure of dispersion?

Options:

A.

Mean

B.

Variance

C.

Correlation

D.

Confidence interval

Questions 65

Which of the following is a process that is used during data integration to collect, blend, and load data?

Options:

A.

MDM

B.

ETL

C.

OLTP

D.

BI

Questions 66

A database consists of one fact table that is composed of multiple dimensions. Each dimension is represented by a denormalized table. This structure is an example of a:

Options:

A.

Non-relational schema

B.

Galaxy schema

C.

Snowflake schema

D.

Star schema

Questions 67

Which of the following BEST describes the issue in which character values are mixed with integer values in a data set column?

Options:

A.

Duplicate data

B.

Missing data

C.

Data outliers

D.

Invalid data type

Questions 68

An analyst reviews the following data:

7

3

5

2

3

7

7

10

Which of the following is the value of the mode?

Options:

A.

3

B.

5

C.

7

D.

10
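For reference, a minimal sketch of finding the most frequent value in the list above by counting occurrences with `collections.Counter`.

```python
from collections import Counter

# The eight values listed in the question.
data = [7, 3, 5, 2, 3, 7, 7, 10]

counts = Counter(data)
# most_common(1) returns the (value, count) pair with the highest count.
value, frequency = counts.most_common(1)[0]
print(value, frequency)
```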

Questions 69

Which of the following best describes a difference between JSON and XML?

Options:

A.

JSON is quicker to read and write.

B.

JSON has to use an end tag.

C.

JSON strings are longer

D.

JSON is much more difficult to parse.

Questions 70

An analyst is building a new dashboard for a user. After an initial conversation with the user, the analyst created a mock-up of the dashboard. Which of the following best explains why the analyst created the mock-up?

Options:

A.

To identify the dimensions and measures

B.

To send to the client after deploying the dashboard to production

C.

To confirm important details before dashboard development begins

D.

To receive client approval for the final dashboard design

Questions 71

An analyst is explaining the company’s financial systems and reporting tools to a new coworker. Which of the following data quality dimensions are the most important? (Select three).

Options:

A.

Data formatting

B.

Data accuracy

C.

Data maturity

D.

Data field

E.

Data completeness

F.

Data consistency

G.

Data diversity

Questions 72

A Chief Executive Officer (CEO) is requesting more up-to-date sales data for improved visibility prior to month-end. An analyst must determine the frequency of a sales report that was previously distributed on an as-needed basis. Which of the following would be the most appropriate frequency for this report?

Options:

A.

Monthly

B.

Quarterly

C.

Weekly

D.

Every other month

Questions 73

A research analyst collects ten data points from 1,000 specimens. The analyst will not need any additional data to complete the analysis and will not need to retrieve information by specifier. Which of the following is the best data structure for the analyst to use?

Options:

A.

NoSQL

B.

Flat file

C.

JSON

D.

Relational database

Questions 74

A data analyst is attempting to understand how ice cream consumption is affected by different attributes, such as cost, temperature, and income level. Which of the following regression analyses should the data analyst perform to understand this relationship?

Options:

A.

Logistic

B.

Ordinary least squares

C.

Cox

D.

Polynomial

Questions 75

Which of the following is the first step an analyst should perform upon receiving a business request for analysis?

Options:

A.

Determine the data needs and sources for analysis.

B.

Initiate the analysis for exploratory data analysis.

C.

Review the business questions to understand the scope.

D.

Finalize the methodology to solve the problem.

Questions 76

A data analyst needs to present the results of an online marketing campaign to the marketing manager. The manager wants to see the most important KPIs and measure the return on marketing investment. Which of the following should the data analyst use to BEST communicate this information to the manager?

Options:

A.

A real-time monitor that allows the manager to view performance the day the campaign was launched

B.

A self-service dashboard that allows the manager to look at the company's annual budget performance

C.

A spreadsheet of the raw data from all marketing campaigns and channels

D.

A summary with statistics, conclusions, and recommendations from the data analyst

Questions 77

Given the following table:

Which of the following describes the data quality issues with the age data?

Options:

A.

Completeness

B.

Consistency

C.

Accuracy

D.

Manipulation

Questions 78

Which of the following is a common data analytics tool that is also used as an interpreted, high-level, general-purpose programming language?

Options:

A.

SAS

B.

Microsoft Power BI

C.

IBM SPSS

D.

Python

Questions 79

A quality assurance manager is examining tolerances in Internet of Things sensors. Which of the following is the best measure for the manager to calculate?

Options:

A.

Standard deviation

B.

Quartile range

C.

Median

D.

Mean

Questions 80

A data set for sales per month includes the following data:

Which of the following cleaning and profiling methods should be applied to the data set?

Options:

A.

Data outliers

B.

Invalid data

C.

Duplicate data

D.

Data type validation

Questions 81

An analyst reviews the following table:

Which of the following data types is represented in the values in the RefNo column?

Options:

A.

Numeric

B.

Real Number

C.

Currency

D.

Alphanumeric

Questions 82

Which of the following is the BEST reason to use database views instead of tables?

Options:

A.

Views reduce the need for repetitive, complex data joins.

B.

Views allow for the storage of temporary data, whereas tables do not.

C.

Views allow for the joining of multiple data sources, whereas tables do not.

D.

Views can be used to restrict sensitive information.

Questions 83

After completing web scraping, which of the following file formats needs to be parsed?

Options:

A.

.html

B.

.txt

C.

.csv

D.

.tsv

Questions 84

An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:

Which of the following conclusions is accurate at a 95% confidence interval?

Options:

A.

In Germany, the increase in conversion from the new layout was not significant.

B.

In France, the increase in conversion from the new layout was not significant.

C.

In general, users who visit the new website are more likely to make a purchase.

D.

The new layout has the lowest conversion rates in the United Kingdom.

Questions 85

Given the following tables:

Which of the following will be the dimensions from a FULL JOIN of the tables above?

Options:

A.

Two rows and three columns

B.

Three rows and four columns

C.

Four rows and two columns

D.

Four rows and four columns

Questions 86

A data analyst is building a closed won quarter-over-quarter report for the sales team. Which of the following will be needed to complete this request?

Options:

A.

The report create date and closed dollar amount

B.

The closed won quarter and the closed dollar amount

C.

The segment and closed dollar amount

D.

The closed won year and sales leader name

Questions 87

A database consists of one fact table that is composed of multiple dimensions. Each dimension is represented by a denormalized table. This structure is an example of a:

Options:

A.

non-relational schema.

B.

galaxy schema.

C.

snowflake schema.

D.

star schema.

Questions 88

An analyst needs to conduct a quick analysis. Which of the following is the FIRST step the analyst should perform with the data?

Options:

A.

Conduct an exploratory analysis and use descriptive statistics.

B.

Conduct a trend analysis and use a scatter chart.

C.

Conduct a link analysis and illustrate the connection points.

D.

Conduct an initial analysis and use a Pareto chart.

Questions 89

Daniel is using Structured Query Language (SQL) to work with data stored in a relational database.

He would like to add several new rows to a database table.

What command should he use?

Options:

A.

SELECT.

B.

ALTER.

C.

INSERT.

D.

UPDATE.
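For reference, a minimal sketch of adding several rows to a table in one call, using Python's built-in `sqlite3` module. The `products` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, price REAL)")

# Hypothetical rows to add; executemany runs the same statement once per tuple.
new_rows = [("A-100", 9.99), ("B-200", 14.50), ("C-300", 3.25)]
conn.executemany("INSERT INTO products (sku, price) VALUES (?, ?)", new_rows)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
print(row_count)
```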

Questions 90

An analyst needs to determine the appropriate data type for the following sample data:

sample data collected:

Which of the following data types should be used for this data?

Options:

A.

Text

B.

Float

C.

Alphanumeric

D.

Numeric

Questions 91

An analyst has written the following code:

SELECT *

FROM Cust_table

WHERE age > 60 AND City = "New York"

Which of the following criteria is the analyst retrieving?

Options:

A.

All customers older than age 60 in New York state

B.

All customers aged 60 and older in New York state

C.

All customers older than age 60 in New York City

D.

All customers younger than age 60 in New York City
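For reference, a minimal sketch that runs the query against a small in-memory SQLite table. The rows and the extra `name` column are hypothetical, and the string literal uses single quotes, which is the portable SQL form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cust_table (name TEXT, age INTEGER, City TEXT)")
conn.executemany(
    "INSERT INTO Cust_table VALUES (?, ?, ?)",
    [
        ("Ana", 60, "New York"),   # exactly 60, so age > 60 excludes this row
        ("Ben", 61, "New York"),
        ("Cho", 70, "Albany"),     # over 60, but the City value differs
        ("Dee", 45, "New York"),
    ],
)

# Both conditions must hold because they are joined with AND.
matches = conn.execute(
    "SELECT * FROM Cust_table WHERE age > 60 AND City = 'New York'"
).fetchall()
print(matches)
```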

Questions 92

Which one of the following programming languages is specifically designed for use in analytics applications?

Options:

A.

Python.

B.

R

C.

C++

D.

Java.

Questions 93

A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered?

Options:

A.

Include a line chart using the site and average sales per customer.

B.

Include a pie chart using the site and sales to average sales per customer.

C.

Include a scatter chart using sales volume and average sales per customer.

D.

Include a column chart using the site and sales to average sales per customer.

Questions 94

Which of the following types of analyses should be used to evaluate the connections and anomalies in a data set when either known patterns are being violated or new patterns are emerging?

Options:

A.

Correlation

B.

Descriptive

C.

Graph

D.

Regression

Questions 95

A data profiling rule checks the quality of all email addresses in a database. The rule returns a value with the number of email addresses that conformed to the rule. Which of the following options describes this value?

Options:

A.

Columns passed

B.

Rows passed

C.

Rows failed

D.

Columns failed

Questions 96

Given the following grocery store orders:

If a query is made to the table with the following logic:

Order_Total > 132 OR (Order_Total >= 25 AND Order_Total < 74)

Which of the following is the number of orders that will be returned by the query?

Options:

A.

Four

B.

Five

C.

Six

D.

Seven
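For reference, a minimal sketch of how the filter logic evaluates. The order table is not reproduced here, so the totals below are hypothetical and the resulting count is not the answer to the question.

```python
def matches(order_total):
    """Mirror the filter: over 132, or from 25 up to (but not including) 74."""
    return order_total > 132 or (25 <= order_total < 74)

# Hypothetical order totals chosen to sit on and around each boundary.
totals = [10, 25, 50, 74, 100, 132, 133, 200]
selected = [t for t in totals if matches(t)]
print(selected, len(selected))
```

The boundary cases are the useful part: 25 passes because of `>=`, while 74 and 132 fail because `<` and `>` are strict.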

Questions 97

A JSON file is an example of:

Options:

A.

structured data.

B.

web data.

C.

machine data.

D.

processed data.

Questions 98

Which of the following concepts should be applied if a data set with 40 fields needs to be pared down to 20 fields and contains similar data across multiple fields?

Options:

A.

Duplication

B.

Consolidation

C.

Compliance

D.

Standardization

Questions 99

Jhon is working on an ELT process that sources data from six different source systems.

Looking at the source data, he finds that data about the same people exists in two of the six systems.

What does he have to make sure he checks for in his ELT process?

Choose the best answer.

Options:

A.

Duplicate Data.

B.

Redundant Data.

C.

Invalid Data.

D.

Missing Data.

Questions 100

An analyst must obtain the average daily sales for the following week:

Which of the following must the analyst perform to obtain this value?

Options:

A.

Data normalization

B.

Data append

C.

Data aggregation

D.

Data blending

Questions 101

An analyst is building a monthly report for production and wants to ensure the audience is aware of its once-a-month cadence. Which of the following is the MOST important to convey that information?

Options:

A.

The date of the dashboard build

B.

The data refresh date

C.

A report summary

D.

Frequently asked questions

Questions 102

Which of the following is an example of a data-mining ETL tool?

Options:

A.

SSIS

B.

Stata

C.

SPSS

D.

Cognos

Questions 103

A dataset requires an analysis for investigating and discovering abnormalities. Which of the following best describes the nature of the exploratory analysis conducted?

Options:

A.

Summary of the data's main characteristics

B.

Best data tuning method

C.

Set of methods for cleaning the data

D.

Method of checking the quality of the data

Questions 104

Consider the following dataset which contains information about houses that are for sale:

Which of the following string manipulation commands will combine the address and region name columns to create a full address?

full_address
-------------------------
85 Turner St, Northern Metropolitan
25 Bloomburg St, Northern Metropolitan
5 Charles St, Northern Metropolitan
40 Federation La, Northern Metropolitan
55a Park St, Northern Metropolitan

Options:

A.

SELECT CONCAT(address, ' , ' , regionname) AS full_address FROM melb LIMIT 5;

B.

SELECT CONCAT(address, '-' , regionname) AS full_address FROM melb LIMIT 5;

C.

SELECT CONCAT(regionname, ' , ' , address) AS full_address FROM melb LIMIT 5;

D.

SELECT CONCAT(regionname, '-' , address) AS full_address FROM melb LIMIT 5;

Questions 105

Which of the following is the best description of discrete data types?

Options:

A.

Non-numeric data used to describe attributes of a population sample

B.

The frequency of the number of times each value occurs by using whole numbers

C.

Numeric values that can be measured on a continuous scale

D.

Non-numeric data used to describe attributes of a population sample ranked in a specific order

Questions 106

You are working with a dataset and need to swap the values in rows with those in columns.

What action do you need to perform?

Options:

A.

Recording.

B.

Filtering.

C.

Aggregation.

D.

Transposition.

Questions 107

Which of the following BEST describes standard deviation?

Options:

A.

A measure that is used to establish a relationship between two variables

B.

A measure of how data is distributed

C.

A measure of the amount of dispersion of a set of values

D.

A measure that is used to find the significant difference between variables

Questions 108

A data analyst is working with a data set and would like to combine two fields into a single field. Which of the following data manipulation techniques should the analyst use?

Options:

A.

Data merge

B.

Transpose

C.

Data append

D.

Concatenation

Questions 109

Which of the following programming languages are best suited for analysis and machine-learning applications? (Select two).

Options:

A.

Ruby

B.

Rust

C.

PHP

D.

Python

E.

Kotlin

F.

R

Questions 110

Emma is working in a data warehouse and finds that a finance fact table links to an organization dimension, which in turn links to a currency dimension that is not linked to the fact table.

What type of design pattern is the data warehouse using?

Options:

A.

Star.

B.

Sun.

C.

Snowflake.

D.

Comet.

Questions 111

Which of the following differentiates a flat text file from other data types?

Options:

A.

Data is separated by a delimiter.

B.

Data is stored in defined rows.

C.

Data is defined with key-value pairs.

D.

Data is housed in a markup language.

Questions 112

Which of the following is a control measure for preventing a data breach?

Options:

A.

Data transmission

B.

Data attribution

C.

Data retention

D.

Data encryption

Questions 113

Which of the following is an example of a discrete data type?

Options:

A.

8in (20cm)

B.

5 kids

C.

2.5mi (4km)

D.

10.7lbs (4.9kg)

Questions 114

An analyst has conducted a review of business questions. Which of the following should the analyst do next to conduct an analysis?

Options:

A.

Determine the data needs and review the observations.

B.

Determine the data needs and sources for analysis.

C.

Determine the data needs and schedule interviews.

D.

Determine the data needs and begin the analysis.

Questions 115

An analyst wants to check the progress and performance regarding the number of customers an organization served in the last six years. Which of the following represents the type of analysis the analyst should perform?

Options:

A.

Correlation analysis

B.

Trend analysis

C.

Regression analysis

D.

Descriptive analysis

Questions 116

An analyst for a concert venue is analyzing the number of tickets sold for a recent event. Which of the following types of data is the number of sold tickets an example of?

Options:

A.

Ordinal

B.

Continuous

C.

Nominal

D.

Discrete

Questions 117

Which of the following defines the policies and procedures for managing the master data?

Options:

A.

Data administration

B.

Data stewardship

C.

Data ownership

D.

Data governance

Questions 118

Which of the following is a common data analytics tool that is also used as an interpreted, high-level, general-purpose programming language?

Options:

A.

SAS

B.

Microsoft Power BI

C.

IBM SPSS

D.

Python

Exam Code: DA0-001
Exam Name: CompTIA Data+ Certification Exam
Last Update: Nov 22, 2025
Questions: 396