python threading GIL multithreading concurrent 2024 Challenge

Read the problem description and solve the challenge in the workspace.

Problem Description

Coding Challenge: A Concurrent Data Aggregator

You're building a blazing-fast data aggregation dashboard for a startup. Your task is to fetch live user statistics from a list of third-party API endpoints.

However, there is a major real-world problem: the standard requests library is blocking. If you call requests.get() sequentially on 100 URLs, your program waits for each server to reply before it sends the next request. Although Python's Global Interpreter Lock (GIL) prevents true parallelism for CPU-bound tasks, threads are well suited to I/O-bound tasks like network requests, where the program spends most of its time waiting.
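To make the I/O-bound point concrete, here is a minimal sketch that uses time.sleep() as a stand-in for a slow network call. The names fake_request, run_sequential, run_threaded, and the 0.2-second DELAY are hypothetical and not part of the challenge. A sleeping thread releases the GIL, so the five waits should overlap instead of adding up.

```python
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.2  # hypothetical per-request latency, in seconds


def fake_request(url: str) -> str:
    """Stand-in for a blocking network call; sleeping releases the GIL."""
    time.sleep(DELAY)
    return f"response from {url}"


def run_sequential(urls: list) -> float:
    """Call the blocking function one URL at a time; return elapsed seconds."""
    start = time.perf_counter()
    for url in urls:
        fake_request(url)
    return time.perf_counter() - start


def run_threaded(urls: list) -> float:
    """Run the same calls in a thread pool; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        list(pool.map(fake_request, urls))  # consume the iterator to wait for all calls
    return time.perf_counter() - start


if __name__ == "__main__":
    urls = [f"https://api.example.com/stats{i}" for i in range(5)]
    print(f"sequential: {run_sequential(urls):.2f}s")  # roughly 5 * DELAY
    print(f"threaded:   {run_threaded(urls):.2f}s")    # roughly 1 * DELAY
```

The same overlap would not appear if fake_request did pure-Python number crunching, because CPU-bound threads contend for the GIL.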

Your challenge is to bridge standard blocking code with modern asynchronous architectures. You must use the asyncio.to_thread() function (added in Python 3.9) to run the blocking requests.get() calls in separate threads. Additionally, the third-party API is notoriously unreliable: it regularly times out, occasionally returns plain HTML error pages instead of JSON, and actively blocks default Python scripts to stop spam bots. Your solution must safely deserialize the data, use custom headers (metadata envelopes) to get past the blocks, and gracefully return None when a server fails rather than crashing the entire application.

Difficulty Level: Advanced

Input & Output Specifications

  • Input:
    • api_urls (List of strings): A list of endpoint URLs to fetch data from.
    • api_token (string): A secret token needed to prove you have permission to access the data.
  • Output:
    • Returns a List containing one result for each URL.
    • Each successful result should be a parsed Python Dictionary (safely deserialized JSON).
    • If the request fails due to a network error, server error, timeout, or invalid JSON (like an HTML error page), the list should contain None for that specific URL.

Starter Code Boilerplate

import asyncio
import requests

def fetch_data_sync(url: str, token: str):
    """
    Blocking function to make the HTTP request safely.
    Implement headers, timeouts, and safe JSON deserialization here.
    """
    # TODO: Implement robust request logic
    pass

async def fetch_all_data(api_urls: list, api_token: str) -> list:
    """
    Asynchronous function to manage the concurrent threads.
    """
    # TODO: Create concurrent tasks using asyncio.to_thread()
    # TODO: Gather and return the results
    pass

# Example execution:
# urls = ["https://api.example.com/stats1", "https://api.example.com/stats2"]
# results = asyncio.run(fetch_all_data(urls, "secret_token_123"))
# print(results)

Hints

  • Bridging Async and Blocking: Many introductory courses stop at basic asyncio. Because requests.get() is a blocking call, placing it directly inside a coroutine stalls the event loop until the server replies. Use asyncio.to_thread(fetch_data_sync, url, token) to run the function in a separate thread; awaiting the result yields control back to the event loop.
  • Gathering Tasks: Look into asyncio.gather() to run all your threaded calls concurrently and collect their results into a single list, in the same order as the input URLs.
  • Headers: APIs use headers as metadata envelopes. Pass a custom User-Agent (e.g., "AnalyticsDashboard/1.0") to get past bot blockers, and pass your token to prove authorization.
  • Safe Deserialization: Don't blindly call .json(). Call response.raise_for_status() first so that 4xx and 5xx responses raise an exception instead of being parsed.
  • Exception Handling: Wrap your request and deserialization logic in a try...except block. Catch requests.exceptions.RequestException to handle network and timeout errors, and requests.exceptions.JSONDecodeError (or ValueError) to keep your code from crashing when the server sends back an HTML page instead of JSON.
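The first two hints can be sketched without touching the network. In the snippet below, slow_lookup and gather_in_threads are hypothetical names, and time.sleep() stands in for requests.get(); this is an illustration of the to_thread-plus-gather pattern under those assumptions, not the challenge solution.

```python
import asyncio
import time


def slow_lookup(url: str, token: str):
    """Hypothetical blocking worker; time.sleep() stands in for requests.get()."""
    time.sleep(0.2)
    if "bad" in url:
        return None  # a real worker would return None after catching an error
    return {"url": url, "authorized": bool(token)}


async def gather_in_threads(urls: list, token: str) -> list:
    """Run the blocking worker in threads and collect results in input order."""
    # asyncio.to_thread() returns a coroutine; nothing runs until it is awaited.
    calls = [asyncio.to_thread(slow_lookup, url, token) for url in urls]
    # gather() schedules all of them and preserves the order of its arguments.
    return await asyncio.gather(*calls)


if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(gather_in_threads(["url_a", "url_bad", "url_c"], "secret_token_123"))
    print(results)
    print(f"elapsed: {time.perf_counter() - start:.2f}s")
```

One design point worth knowing: asyncio.to_thread() submits work to the event loop's default thread pool, which has a bounded number of workers, so a very long URL list is processed in batches rather than all at once.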

Test Cases

Since we don't have a live unreliable API, you can mentally verify or mock the following scenarios to ensure your logic is bulletproof:

  1. Test Case 1 (All Success):
     • Input: ["url1", "url2"], valid token.
     • Simulated API: Both respond instantly with {"status": "ok", "users": 150}.
     • Expected Output: [{"status": "ok", "users": 150}, {"status": "ok", "users": 150}]
  2. Test Case 2 (Timeout & HTML Error):
     • Input: ["url_timeout", "url_html", "url_success"], valid token.
     • Simulated API: URL 1 hangs forever. URL 2 returns an HTML 500 server error page. URL 3 returns {"users": 42}.
     • Expected Output: [None, None, {"users": 42}]
  3. Test Case 3 (Missing Headers/Bot Block):
     • Input: ["url_strict"], missing custom User-Agent.
     • Simulated API: Server rejects default Python scripts and returns 403 Forbidden.
     • Expected Output: [None] (The program shouldn't crash; it should catch the error and gracefully return None.)
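One way to mock Test Case 2 without a network is sketched below. Everything here is a stand-in: CANNED, fake_fetch, and aggregate are hypothetical names, the stdlib json module replaces response.json(), and the built-in TimeoutError replaces the requests exceptions that a real fetch_data_sync would catch.

```python
import asyncio
import json

# Hypothetical canned responses standing in for the unreliable API.
CANNED = {
    "url_timeout": TimeoutError("simulated timeout"),
    "url_html": "<html><body>500 Internal Server Error</body></html>",
    "url_success": '{"users": 42}',
}


def fake_fetch(url: str, token: str):
    """Stand-in for fetch_data_sync: no network, same 'return None on failure' shape."""
    try:
        body = CANNED[url]
        if isinstance(body, Exception):
            raise body  # simulates a network-level failure
        data = json.loads(body)  # raises json.JSONDecodeError (a ValueError) on HTML
    except (TimeoutError, ValueError):
        return None
    # The spec asks for a dictionary, so reject other valid JSON such as lists.
    return data if isinstance(data, dict) else None


async def aggregate(urls: list, token: str) -> list:
    """Run the stand-in worker in threads, one call per URL."""
    return await asyncio.gather(*(asyncio.to_thread(fake_fetch, u, token) for u in urls))


if __name__ == "__main__":
    urls = ["url_timeout", "url_html", "url_success"]
    print(asyncio.run(aggregate(urls, "secret_token_123")))
```

In a real solution the except clause would name requests.exceptions.RequestException, as the hints describe; the structure of the check stays the same.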

Verify Your Solution

Run assertions against your code in the sandbox environment.

Sandbox Instructions

1. Click Copy Starter Boilerplate at the top to copy the function definition.
2. Use the interactive compiler to implement and run your code securely.
3. Click Verify & Submit Solution to validate your code.