Python List Comprehensions
Apply your skills with a real-world coding challenge. Try to solve it yourself first!
Coding Challenge: Startup Data Sorter
Problem Description
You have just been hired as a data analyst for a fast-growing tech startup, and your first assignment is to clean up two messy datasets generated by the company's legacy backend system.
First, the system tracks daily user activity hours. It outputs the data as a confusing multi-dimensional matrix (a list of lists representing weeks and days). You need to flatten this matrix into a single, continuous list of daily hours, and you must completely filter out days where the user was inactive (hours equal to 0).
Second, the HR department needs a roster of top-performing employees. The system gave you a list of employee names and a completely separate, parallel list of their performance scores. You need to combine these into a dictionary, but your final dictionary should only include employees who scored 90 or above.
Instead of writing clunky, slow nested for loops and manually counting indices, you must solve both problems elegantly using Python's comprehensions and built-in iterators!
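For reference, the loop-and-index style that the challenge asks you to replace might look like the sketch below. It runs on made-up toy data: the names grid, labels, and values are hypothetical and are not the challenge inputs, and the filters are only illustrative.

```python
# Hypothetical toy data (not the challenge inputs).
grid = [[3, 0, 4], [0, 5, 0]]
labels = ["a", "b", "c"]
values = [10, 25, 30]

# Flatten and filter with nested for loops and manual indices.
nonzero = []
for i in range(len(grid)):
    for j in range(len(grid[i])):
        if grid[i][j] != 0:
            nonzero.append(grid[i][j])

# Pair two parallel lists by tracking an index by hand.
large = {}
for i in range(len(labels)):
    if values[i] >= 20:
        large[labels[i]] = values[i]

print(nonzero)  # [3, 4, 5]
print(large)    # {'b': 25, 'c': 30}
```

Each task here takes several lines and several index expressions. The challenge is to express the same kind of logic in a single line per task.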
Difficulty Level: Intermediate
Input Specifications
You will be provided with the following predefined variables:
* activity_matrix (list of lists of floats): A 2D matrix representing daily active hours grouped by week.
* employee_names (list of strings): A collection of employee names.
* performance_scores (list of integers): A collection of performance scores, each matching the employee name at the same position.
Output Specifications
Your code must perform the following transformations in exactly one line of code each:
1. active_days: A flat list of all active hours (values strictly greater than 0) extracted from the matrix.
2. top_performers: A dictionary where the keys are employee names and the values are their scores, containing only employees who scored 90 or above.
Starter Code Boilerplate
# --- INPUT DATA ---
activity_matrix = [
[1.5, 0, 2.0, 0, 4.5],
[0, 0, 3.2, 1.1, 0],
[5.0, 0, 0, 0.5, 0]
]
employee_names = ["Alice", "Marcus", "Fiona", "Evan", "Diana"]
performance_scores =
# --- YOUR CODE HERE ---
# 1. Flatten the matrix and filter out 0s using a nested list comprehension
active_days = # [YOUR ONE-LINE COMPREHENSION HERE]
# 2. Combine the parallel lists and filter scores >= 90 using a dictionary comprehension and zip()
top_performers = # {YOUR ONE-LINE COMPREHENSION HERE}
# --- PRINT RESULTS ---
print(f"Active Days: {active_days}")
print(f"Top Performers: {top_performers}")
Hints
- The Russian Nesting Doll: To flatten a matrix with a nested list comprehension, read it strictly from left to right: start by grabbing the row from the matrix (for row in matrix), then grab the individual item from that row (for item in row), and finally add your if condition at the very end to filter out the zeros.
- Parallel Iteration: Do not use manual index tracking like employee_names[i]! As taught in the tutorial, repeatedly indexing at the Python level is slow; instead, use Python's built-in zip(employee_names, performance_scores), which pairs the data together efficiently at the C level.
- Dictionary Comprehensions: Remember that dictionary comprehensions use curly braces {}. The expression at the beginning declares the key-value pair using a colon (e.g., name: score), followed by your for loop iterating over the zipped data and ending with your if condition.
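The three hints above can be sketched on unrelated toy data. The names readings, fruits, and prices, and the thresholds used, are hypothetical and chosen only for illustration; adapting the pattern to the challenge variables is left to you.

```python
# Hypothetical toy data (not the challenge inputs).
readings = [[12, -3, 7], [-8, 15]]
fruits = ["apple", "kiwi", "mango"]
prices = [3, 1, 4]

# Nested list comprehension: the for clauses read left to right,
# outer loop first, then inner loop, with the if condition last.
positive_readings = [value for row in readings for value in row if value > 0]

# zip() pairs items from both lists position by position,
# so no manual index is needed.
pairs = list(zip(fruits, prices))

# Dictionary comprehension: "key: value" expression first,
# then the for loop over the zipped pairs, then the if condition.
pricey = {fruit: price for fruit, price in zip(fruits, prices) if price >= 3}

print(positive_readings)  # [12, 7, 15]
print(pairs)              # [('apple', 3), ('kiwi', 1), ('mango', 4)]
print(pricey)             # {'apple': 3, 'mango': 4}
```

Note that zip() stops at the shorter of its inputs, so this pattern assumes the two parallel lists have the same length.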
Test Cases
Test Case 1 (From Starter Code)
* Inputs:
* activity_matrix = [[1.5, 0, 2.0, 0, 4.5], [0, 0, 3.2, 1.1, 0], [5.0, 0, 0, 0.5, 0]]
* employee_names = ["Alice", "Marcus", "Fiona", "Evan", "Diana"]
* performance_scores =
* Expected Output:
* Active Days: [1.5, 2.0, 4.5, 3.2, 1.1, 5.0, 0.5]
* Top Performers: {'Marcus': 92, 'Fiona': 98, 'Diana': 91}
Test Case 2 (Low Activity, High Performers)
* Inputs:
* activity_matrix = [, [0, 8.5, 0], [1.2, 0, 0]]
* employee_names = ["Charlie", "George", "Hannah"]
* performance_scores =
* Expected Output:
* Active Days: [8.5, 1.2]
* Top Performers: {'Charlie': 95, 'Hannah': 100}
(Explanation: The comprehension skips all zeros in the matrix, leaving only two active sessions. George is filtered out of the dictionary because his score was below 90.)