Essay - Published: 2019.09.02 | async | await | python |
DISCLOSURE: If you buy through affiliate links, I may earn a small commission. (disclosures)
I'm new to Python and coming from a C# background. Building a program that downloads a lot of images, so would be useful to be able to async each download / write in a task and then parallelize those downloads / io writes. How can I parallelize these fetches and writes using the built-in asyncio and the external aiohttp libraries in a similar fashion as C#'s Task.AwaitAll(MYTASKS)?
Here's an example. Obviously there's a bajillion ways to do this, but this example shows how to start your script such that asyncio can handle it and how to create an array of tasks and await them.
Note: This is written against the latest (as of writing) Python version, 3.7.3. Other versions may require slightly different code.
The code:
import asyncio
import random
import time
async def main():
async_tasks = []
number_of_tasks = 10
wait_times_seconds = []
for i in range(0, number_of_tasks):
wait_times_seconds.append(i)
for wait_seconds in wait_times_seconds:
async_tasks.append(wait_for_time_async(wait_seconds))
start_time = time.time()
await asyncio.gather(*async_tasks)
end_time = time.time()
print("total operation time = ", (end_time - start_time))
async def wait_for_time_async(seconds: int) -> None:
await asyncio.sleep(seconds)
asyncio.run(main())
So what does this code do?
asyncio.run(MYFUNCTION) (line 24) to give asyncio access to handle async operationswait_times_seconds (lines 10 and 11)async_tasks on lines 13 and 14 (you can see the awaiting code via asyncio.sleep(SECONDS) on lines 21 and 22)asyncio.gather(*ASYNC_TASKS) on line 17The output of this on my machine is:
$ python async_example.py
total operation time = 9.002383708953857
So we ran in parallel via async, so this method works to run in parallel via async.
To sanity check, we know that we created 10 tasks from 0 - 9 and then tried to await each of them. If they waited sequentially, we would've waited 0 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 = 45 seconds, but based on the output we only waited 9 (the largest wait_time in our array) and thus this likely did run in parallel via async.
I regularly post about tech topics I run into. You can get a periodic email containing updates on new posts and things I've built by subscribing here.
The best way to support my work is to like / comment / share for the algorithm and subscribe for future updates.