Try 10 focused Python Institute PCEI questions on machine-learning fundamentals, with explanations, then continue with IT Mastery.
Open the matching IT Mastery practice page for timed mocks, topic drills, progress tracking, explanations, and full practice.
| Field | Detail |
|---|---|
| Exam route | Python Institute PCEI |
| Topic area | Block 2: Machine Learning Fundamentals |
| Blueprint weight | 16.5% |
| Page purpose | Focused sample questions before returning to mixed practice |
Use this page to isolate Block 2: Machine Learning Fundamentals for Python Institute PCEI. Work through the 10 questions first, then review the explanations and return to mixed practice in IT Mastery.
| Pass | What to do | What to record |
|---|---|---|
| First attempt | Answer without checking the explanation first. | The fact, rule, calculation, or judgment point that controlled your answer. |
| Review | Read the explanation even when you were correct. | Why the best answer is stronger than the closest distractor. |
| Repair | Repeat only missed or uncertain items after a short break. | The pattern behind misses, not the answer letter. |
| Transfer | Return to mixed practice once the topic feels stable. | Whether the same skill holds up when the topic is no longer obvious. |
Blueprint context: 16.5% of the practice outline. A focused topic score can overstate readiness if you recognize the pattern too quickly, so use it as repair work before timed mixed sets.
These questions are original IT Mastery practice items aligned to this topic area. They are designed for self-assessment and are not official exam questions.
Topic: Block 2: Machine Learning Fundamentals
A beginner classifier uses nearest-neighbor classification with k = 1. A smaller distance means the training example is more similar to the new message.
Exhibit: Distances to the new message
| Training message | Known label | Distance |
|---|---|---|
| Message A | not spam | 3.2 |
| Message B | spam | 1.1 |
| Message C | not spam | 2.4 |
| Message D | spam | 4.0 |
Which label should the classifier assign to the new message?
Options:
A. Assign not spam
B. Assign spam
C. Average the labels first
D. Wait for more training rows
Best answer: B
Explanation: Nearest-neighbor classification assigns a label by comparing a new item to labeled training examples. With k = 1, only the single closest training example matters. In the exhibit, the smallest distance is 1.1 for Message B, so the new message receives Message B’s known label. The other distances are larger and do not affect the decision when k = 1.
The key takeaway is that smaller distance means greater similarity, and k = 1 uses only the nearest labeled example.
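The k = 1 decision above can be sketched in a few lines of Python. This is only an illustration: the tuples copy the exhibit, and the variable names are chosen here rather than taken from any library.

```python
# Distances and labels copied from the exhibit above.
training = [
    ("Message A", "not spam", 3.2),
    ("Message B", "spam", 1.1),
    ("Message C", "not spam", 2.4),
    ("Message D", "spam", 4.0),
]

# With k = 1, only the single training example with the smallest distance matters.
nearest = min(training, key=lambda row: row[2])
predicted_label = nearest[1]
print(nearest[0], predicted_label)  # Message B spam
```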
Topic: Block 2: Machine Learning Fundamentals
A small clinic has 800 past appointment records. Each record includes features such as day of week, appointment type, and patient age group, plus a known label: missed or attended. The team wants a Python model that predicts missed or attended for new appointments. A coworker proposes using k-means clustering because it can group similar records. What is the best judgment about this proposal?
Options:
A. Use linear regression because the model predicts the future
B. Use a supervised classification algorithm instead
C. Use k-means because the records have similar features
D. Remove the labels so k-means can learn without bias
Best answer: B
Explanation: This is an algorithm-fit question. The clinic already has labeled examples (missed or attended) and wants the same kind of label for new records. That is a supervised classification task: the model learns from input features paired with known class labels. K-means clustering is unsupervised and is used to find groups when labels are not provided or when the goal is exploration, not direct prediction of a known class. Regression is also mismatched because the desired output is not a continuous number. The key takeaway is to match the algorithm family to the available labels and the desired output type.
Topic: Block 2: Machine Learning Fundamentals
A beginner ML script compares a new point with one known point using Euclidean distance. The formula is \(d = \sqrt{(x_2-x_1)^2 + (y_2-y_1)^2}\). Which output does the script print? Select ONE.
```python
import math

new_point = (2, 3)
known_point = (5, 7)
distance = math.sqrt((5 - 2)**2 + (7 - 3)**2)
print(distance)
```
Options:
A. 25.0
B. 7.0
C. 5.0
D. 1.0
Best answer: C
Explanation: Euclidean distance is the straight-line distance between two numeric points. For the points (2, 3) and (5, 7), the x-values differ by 3 and the y-values differ by 4. Squaring those differences gives 9 and 16, and their sum is 25. The square root of 25 is 5. Because math.sqrt() returns a floating-point number in Python, the printed output is shown with a decimal as 5.0. The key idea is to square the coordinate differences before adding them, then take the square root.
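The same calculation can be written as a reusable function instead of hard-coded numbers. The sketch below uses a helper named `euclidean`, a name chosen here for illustration, and compares it with the standard library's `math.dist` (available in Python 3.8 and later).

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points with the same number of coordinates."""
    # Square each coordinate difference, add them, then take the square root.
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))

new_point = (2, 3)
known_point = (5, 7)

print(euclidean(new_point, known_point))  # 5.0
print(math.dist(new_point, known_point))  # 5.0
```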
Topic: Block 2: Machine Learning Fundamentals
A beginner AI project has this note:
System: warehouse robot simulator
Behavior: tries different paths to a pickup point
Feedback: +5 for reaching the item, -2 for hitting a wall
Result: after many trials, it chooses shorter paths more often
Select ONE: Which type of machine learning does this example best illustrate?
Options:
A. Reinforcement learning
B. Rule-based programming
C. Supervised learning
D. Unsupervised learning
Best answer: A
Explanation: Reinforcement learning is used when an agent learns by taking actions in an environment and receiving feedback as rewards or penalties. In the note, the robot simulator tries paths, receives positive feedback for reaching the item, and receives negative feedback for hitting a wall. Over repeated trials, it changes its behavior toward better paths.
Supervised learning would require labeled examples, such as paths already marked as “good” or “bad.” Unsupervised learning would look for patterns in unlabeled data, not learn from action-based rewards.
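The reward-driven loop can be illustrated with a deliberately simplified sketch. It is not a path-finding simulator: it assumes just two hypothetical actions, `short_path` and `wall_path`, whose rewards mirror the +5 and -2 in the note, and it uses a basic explore-or-exploit rule with a running average of rewards. The exploration rate, trial count, and seed are illustrative assumptions.

```python
import random

# Hypothetical two-action environment; rewards mirror the note (+5 item, -2 wall).
REWARDS = {"short_path": 5, "wall_path": -2}

def run_trials(n_trials=200, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    value = {action: 0.0 for action in REWARDS}  # running average reward per action
    count = {action: 0 for action in REWARDS}    # how often each action was tried
    for _ in range(n_trials):
        if rng.random() < epsilon:
            action = rng.choice(list(REWARDS))   # explore: try a random action
        else:
            action = max(value, key=value.get)   # exploit: use the best estimate so far
        reward = REWARDS[action]
        count[action] += 1
        # Incremental update of the average reward seen for this action.
        value[action] += (reward - value[action]) / count[action]
    return value, count

value, count = run_trials()
print(max(value, key=value.get), count)
```

No example is ever labeled "good" or "bad" in advance; the preference for the rewarding action emerges only from feedback received after acting.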
Topic: Block 2: Machine Learning Fundamentals
A beginner ML exercise uses fixed thresholds to classify machine readings before any model is trained. Interpret the Python logic shown.
```python
readings = [
    {"id": "A", "temp": 72, "vibration": 4},
    {"id": "B", "temp": 66, "vibration": 6},
    {"id": "C", "temp": 81, "vibration": 3},
]

def classify(r):
    if r["temp"] >= 80 or r["vibration"] >= 7:
        return "urgent"
    elif r["temp"] >= 70 or r["vibration"] >= 5:
        return "watch"
    else:
        return "normal"

labels = {r["id"]: classify(r) for r in readings}
print(labels)
```
Which output is produced?
Options:
A. {'A': 'urgent', 'B': 'watch', 'C': 'urgent'}
B. {'A': 'watch', 'B': 'normal', 'C': 'urgent'}
C. {'A': 'watch', 'B': 'watch', 'C': 'urgent'}
D. {'A': 'normal', 'B': 'watch', 'C': 'urgent'}
Best answer: C
Explanation: Rule-based classification applies the first condition whose threshold test is true. The urgent rule is checked first and requires temp >= 80 or vibration >= 7. Reading C has temp 81, so it is urgent. Readings A and B do not meet the urgent thresholds, so Python checks the elif: A has temp 72, and B has vibration 6, so both are watch. The else branch is used only when neither threshold group is met.
Topic: Block 2: Machine Learning Fundamentals
A beginner AI team records this project note:
Goal: Group customers with similar buying patterns.
Data: Past purchases and visit counts for each customer.
Provided answers: No category names or target labels are included.
Expected result: Customer groups for later marketing review.
Which type of machine learning does this task describe?
Options:
A. Supervised learning
B. Rule-based classification
C. Unsupervised learning
D. Reinforcement learning
Best answer: C
Explanation: Unsupervised learning is used when the data has inputs but no provided correct answers, labels, or target values. In this note, the team wants the model to discover customer groups from purchase and visit patterns. Because the expected result is a set of groups rather than predictions against known labels, this is a clustering-style unsupervised task. Supervised learning would require examples such as “customer type = budget buyer,” and reinforcement learning would involve an agent learning from rewards after actions.
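To make the clustering idea concrete, here is a minimal k-means-style sketch in plain Python. The customer values are hypothetical, and the starting centroids are picked by hand to keep the run deterministic; practical implementations usually choose starting points randomly or with a smarter initialization.

```python
import math

# Hypothetical unlabeled customers: (past purchases, visit count). No labels are given.
customers = [(2, 3), (3, 4), (2, 5), (20, 18), (22, 21), (19, 20)]

def kmeans(points, centroids, iterations=10):
    for _ in range(iterations):
        # Assignment step: each point joins the group of its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            groups[idx].append(p)
        # Update step: move each centroid to the mean of its group.
        centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return groups, centroids

# Assumption: start from one low-activity and one high-activity customer.
groups, centroids = kmeans(customers, centroids=[customers[0], customers[3]])
print(groups)
```

The output is two groups of similar customers. Nothing in the code names those groups; deciding what they mean is left to the later marketing review.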
Topic: Block 2: Machine Learning Fundamentals
A team is preparing a small supervised ML dataset to predict support-ticket priority. The priority label must be one of Low, Medium, or High, and days_open must be numeric. Before splitting the data into training and test sets, you inspect this sample:
```
ticket_id,days_open,customer_tier,priority
T101,2,Gold,High
T102,,Silver,Low
T103,three,Bronze,Medium
T104,5,Gold,High
T104,5,Gold,Low
T105,999,Silver,Urgent
```
Which is the best next action?
Options:
A. Convert every column to text so all values have the same type
B. Investigate and clean the missing value, type error, outlier, duplicate, and invalid label
C. Remove only the repeated ticket_id row and keep the rest unchanged
D. Split the data now because the model can learn around noisy rows
Best answer: B
Explanation: Data-quality checks should happen before training and usually before the final train/test split, so the team understands what data the model will learn from. This sample has several common issues: a missing days_open value, a nonnumeric value (three) in a numeric field, a likely outlier (999 days), a duplicate ticket_id with conflicting labels, and an invalid label (Urgent) outside the allowed label set. The best action is to investigate and clean or document these issues using consistent rules, rather than letting them silently affect training or evaluation. Cleaning only one issue would leave other problems that can distort model behavior and metrics.
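A first inspection pass over the sample could look like the sketch below, which only flags issues and leaves the cleaning decisions to the team. The rows are copied from the question; the `MAX_DAYS` outlier threshold is a hypothetical value chosen for illustration, not a rule from the exercise.

```python
import csv
import io

# Sample rows copied from the question above.
SAMPLE = """ticket_id,days_open,customer_tier,priority
T101,2,Gold,High
T102,,Silver,Low
T103,three,Bronze,Medium
T104,5,Gold,High
T104,5,Gold,Low
T105,999,Silver,Urgent
"""

ALLOWED = {"Low", "Medium", "High"}
MAX_DAYS = 365  # hypothetical outlier threshold (assumption)

issues = []
seen_ids = set()
for row in csv.DictReader(io.StringIO(SAMPLE)):
    tid, days = row["ticket_id"], row["days_open"]
    if tid in seen_ids:
        issues.append((tid, "duplicate ticket_id"))
    seen_ids.add(tid)
    if days == "":
        issues.append((tid, "missing days_open"))
    elif not days.isdigit():
        issues.append((tid, "non-numeric days_open"))
    elif int(days) > MAX_DAYS:
        issues.append((tid, "possible outlier"))
    if row["priority"] not in ALLOWED:
        issues.append((tid, "invalid label"))

for issue in issues:
    print(issue)
```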
Topic: Block 2: Machine Learning Fundamentals
A beginner ML team is building a model to classify support tickets as billing, technical, or account.
Workflow note:
1. Exported 2,000 past support tickets from the help desk system
2. Removed duplicate tickets and fixed missing category labels
3. Used the cleaned labeled tickets to build a classification model
According to the basic machine learning workflow, what should the team do next?
Options:
A. Clean the same labels again
B. Use the model for live inference immediately
C. Evaluate the model on test data
D. Collect the original tickets again
Best answer: C
Explanation: The basic machine learning workflow usually follows this order: data collection, data cleaning, training, evaluation, and inference. In the note, the team has already collected past tickets, cleaned the dataset, and trained a classification model. The next step is evaluation: checking the trained model on data not used for training to estimate how well it performs. Only after evaluation shows acceptable results should the team use the model for inference, such as classifying new live tickets. The key distinction is that training builds the model, while evaluation checks whether the trained model is reliable enough to use.
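As a rough illustration of the evaluation step, the sketch below compares predictions with the true labels of held-out tickets and reports accuracy. Both lists are hypothetical stand-ins for real test data and model output.

```python
# Hypothetical held-out test labels and model predictions (not from the exercise).
actual    = ["billing", "technical", "account", "billing", "technical"]
predicted = ["billing", "technical", "billing", "billing", "account"]

def accuracy(actual, predicted):
    """Fraction of test items where the prediction matches the true label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

print(accuracy(actual, predicted))  # 0.6
```

The important detail is that `actual` comes from tickets the model did not see during training; scoring the model on its own training data would tend to overstate how well it performs.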
Topic: Block 2: Machine Learning Fundamentals
A beginner model checks products for defects. The positive class is defect.
| Item | Actual class | Model prediction |
|---|---|---|
| A | defect | defect |
| B | no defect | defect |
| C | defect | no defect |
| D | no defect | no defect |
Which statement correctly interprets the model results? Select ONE.
Options:
A. A is a false positive; D is a false negative.
B. B is a false negative; C is a false positive.
C. B is a false positive; C is a false negative.
D. A is a true negative; D is a true positive.
Best answer: C
Explanation: In classification metrics, “positive” means the class being detected: here, defect. A false positive happens when the model predicts the positive class but the actual class is negative. Item B is actually no defect but was predicted as defect, so it is a false positive. A false negative happens when the model predicts the negative class but the actual class is positive. Item C is actually defect but was predicted as no defect, so it is a false negative.
The key is to compare each prediction with the actual class and keep track of which class is defined as positive.
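The comparison can be expressed as a small function. In this sketch the rows are copied from the table above, and `outcome` is a helper name chosen here for illustration.

```python
POSITIVE = "defect"  # the class the model is trying to detect

# (item, actual class, model prediction) copied from the table above.
results = [
    ("A", "defect", "defect"),
    ("B", "no defect", "defect"),
    ("C", "defect", "no defect"),
    ("D", "no defect", "no defect"),
]

def outcome(actual, predicted):
    # "positive"/"negative" describes the prediction; "true"/"false" says whether it was right.
    if predicted == POSITIVE:
        return "true positive" if actual == POSITIVE else "false positive"
    return "false negative" if actual == POSITIVE else "true negative"

outcomes = {item: outcome(actual, predicted) for item, actual, predicted in results}
print(outcomes)
```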
Topic: Block 2: Machine Learning Fundamentals
A beginner team wants to predict whether a new support ticket should be assigned to billing or technical. They propose using k-means on old tickets.
Project note:
Old data fields: word_count, has_invoice_number, has_error_code, assigned_team
Example assigned_team values: billing, technical
Proposed output needed: billing or technical
Proposed algorithm: k-means clustering
What does this evidence show? Select ONE.
Options:
A. The algorithm is well matched because k-means predicts labels directly.
B. The label column should be removed because it prevents learning.
C. The task should use regression because there are two possible outputs.
D. The algorithm is mismatched; use supervised classification.
Best answer: D
Explanation: This is an algorithm-fit issue. The project already has labeled examples in assigned_team, and the desired result for a new ticket is one of the known categories: billing or technical. That makes the task supervised classification. K-means is an unsupervised clustering algorithm; it groups similar records but does not learn from the provided class labels or directly predict meaningful category names. A classifier such as a decision tree, k-nearest neighbors classifier, or similar supervised method would match the task better.
The key takeaway is to match the algorithm family to the data and output: labeled category prediction calls for classification, not clustering.
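For contrast with clustering, here is a minimal supervised sketch: a 1-nearest-neighbor classifier that learns from labeled tickets and returns a team name. The field names follow the project note, but the rows and the `predict` helper are hypothetical.

```python
import math

# Hypothetical labeled tickets using the fields from the project note:
# (word_count, has_invoice_number, has_error_code) -> assigned_team
labeled = [
    ((40, 1, 0), "billing"),
    ((55, 1, 0), "billing"),
    ((120, 0, 1), "technical"),
    ((95, 0, 1), "technical"),
]

def predict(features):
    """1-nearest-neighbor: return the label of the closest labeled ticket."""
    _, label = min(labeled, key=lambda row: math.dist(row[0], features))
    return label

print(predict((50, 1, 0)))   # billing
print(predict((110, 0, 1)))  # technical
```

Unlike k-means, this uses the `assigned_team` label directly, so its output is always one of the known categories. One caveat of the sketch is that unscaled `word_count` dominates the distance, so a real project would likely scale the features first.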
Use the Python Institute PCEI Practice Test page for the full IT Mastery practice bank, mixed-topic practice, timed mock exams, explanations, and web/mobile app access.
Read the Python Institute PCEI Cheat Sheet for compact concept review before returning to timed practice.