High-Performance Data Analytics with Python - Day 2

High-Performance Data Analytics with Python, schedule: https://hackmd.io/@yonglei/python-hpda-2025-schedule

Schedule

Time         Contents                                Instructor(s)
09:05-10:20  Parallel computing                      QL
10:20-10:40  Break
10:40-11:55  Benchmarking, profiling and optimizing  AM
11:55-12:00  Q/A & Reflections


You can ask questions about the workshop content at the bottom of this page. We use the Zoom chat only for reporting Zoom problems and such.

Questions, answers and information

4. Parallel computing

  • Can you please define process, task and threads?

    • Task: a generic term for a piece of work, typically a function call with its own input. A task can be executed on the computer using processes or threads.
    • Process: a program usually starts off as a single process. When we use multiprocessing, the main process starts several separate processes, each with its own memory space. Provided memory is not a constraint, code is often easier to parallelize with multiprocessing, because the workers do not share state.
    • Thread: threads are more lightweight than processes, and the threads of one process share its memory.
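To make the three terms concrete, here is a minimal sketch using only the standard library. The function `describe_task` and the list `shared_log` are hypothetical names chosen for illustration. Each call of `describe_task` with one input is a task; the tasks run in several threads, and all of them report the same process ID and write to the same list, because threads share the memory of their parent process.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor

shared_log = []  # lives in the memory of the single process running this script


def describe_task(x):
    """One task: a function applied to one input."""
    shared_log.append(x)  # every thread writes to the very same list object
    return x * x, os.getpid(), threading.get_ident()


with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(describe_task, range(8)))

squares = [r[0] for r in results]
pids = {r[1] for r in results}
thread_ids = {r[2] for r in results}

print("squares:", squares)
print("distinct process IDs:", len(pids))        # all threads live in one process
print("distinct thread IDs:", len(thread_ids))   # between 1 and 4, depends on scheduling
print("items written to shared_log:", len(shared_log))
```

If the same tasks ran in separate worker processes instead (for example with `concurrent.futures.ProcessPoolExecutor`), each worker would have its own process ID and its own copy of `shared_log`, so the list in the main process would stay empty.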
  • Two interesting blog posts about multiprocessing and mpi4py

    • Parallel programming in Python with multiprocessing: part 1 and part 2
    • Parallel programming in Python with mpi4py: part 1 and part 2
  • Hi, I got an error with the second example:

    TypeError: ThreadPoolExecutor.__init__() got an unexpected keyword argument 'max_worker'

    • I also got this error! I am using VSCode with an Interactive mode window
    • Tatianais's suggestion solved the problem
    • Note for exercise II (I/O-bound vs CPU-bound): max_worker should be max_workers
    • Thanks, @Tatianais!
  • In the exercise about CPU-bound tasks, the function returns the sum. How do I get the output?

    • The exercise is about comparing the time spent using multithreading vs multiprocessing, so the returned value itself is not the focus.
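On how to get the output: an executor hands back the return value of each task, either through `executor.map` or through the `result()` method of the future returned by `submit`. A minimal sketch, where `cpu_task` is a hypothetical stand-in for the exercise function (the one in the lesson may differ):

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_task(n):
    """Hypothetical CPU-bound task: sum of the squares below n."""
    return sum(i * i for i in range(n))


inputs = [100_000, 200_000, 300_000]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as executor:  # note the spelling: max_workers
    # Option 1: map yields the results in the same order as the inputs
    mapped = list(executor.map(cpu_task, inputs))
    # Option 2: submit returns a Future; result() waits for the value and returns it
    futures = [executor.submit(cpu_task, n) for n in inputs]
    submitted = [f.result() for f in futures]
elapsed = time.perf_counter() - start

print(mapped)
print(f"elapsed: {elapsed:.3f} s")
```

`ProcessPoolExecutor` offers the same `map` and `submit` interface, so swapping the class name (and running the pool under an `if __name__ == "__main__":` guard) gives the multiprocessing timing for the comparison.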
  • General question about locks: is a lock required if several threads access the same variable for reading only?

    • If it is guaranteed that every access is read-only, then locks are not required. Locks are used in multithreading to avoid race conditions, which usually arise when at least one thread writes to shared data. While a lock is held, the protected section runs like sequential code.
      Thanx!
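    • An aside: up to Python 3.12, CPython always had a Global Interpreter Lock (GIL), which allows only one thread to execute Python bytecode at a time. This made many multithreaded programs work without explicit locks, although it never made compound operations such as x += 1 atomic. Since Python 3.13, an optional free-threaded build can run Python without the GIL.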
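As a minimal sketch of the point about writes, the hypothetical counter below is updated by several threads. The read-modify-write in `state["counter"] += 1` is the kind of operation a lock protects; threads that only read the value would not need one.

```python
import threading

state = {"counter": 0}          # shared, mutable data
counter_lock = threading.Lock()


def add_many(n):
    """Increment the shared counter n times, holding the lock for each update."""
    for _ in range(n):
        with counter_lock:      # only one thread at a time executes this block
            state["counter"] += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(state["counter"])  # 40000
```

Because each update happens while the lock is held, the updates are serialized, which is also why heavy locking makes threaded code behave much like sequential code.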

Break until XX:40

5. Benchmarking, profiling and optimizing

Exercise 1

  • For exercise 1, I got the results below when using %time. What is the difference between CPU time and wall time?

    # output for square_sum example
    CPU times: total: 0 ns
    Wall time: 0 ns
    np.int64(332833500)

    # output for urlopen example
    CPU times: total: 250 ms
    Wall time: 736 ms
    <http.client.HTTPResponse at 0x2e2e2212230>
    
    • We will explain it shortly, but 0 ns is unusual. Did you get an error? Was it for question 1 or 3? If not, you seem to have a super fast computer. In that case, increase the size of the array, for example a = np.arange(1_000_000)
    • maybe the computation is too fast
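On the difference between the two numbers: wall time is the elapsed real time, while CPU time counts only the time the processor spent executing the process. A minimal sketch with the standard library, where `measure` and `busy` are hypothetical helpers: a task that mostly waits (like `urlopen` waiting for the network, imitated here by `time.sleep`) shows a wall time far above its CPU time, while a pure computation typically shows the two close together.

```python
import time


def measure(func, *args):
    """Return (wall_seconds, cpu_seconds) for one call of func(*args)."""
    wall_start = time.perf_counter()   # high-resolution wall clock
    cpu_start = time.process_time()    # user + system CPU time of this process
    func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def busy(n):
    """CPU-bound work: sum of the squares below n."""
    return sum(i * i for i in range(n))


wait_wall, wait_cpu = measure(time.sleep, 0.2)   # waiting, almost no CPU use
busy_wall, busy_cpu = measure(busy, 500_000)     # computing the whole time

print(f"sleep: wall {wait_wall:.3f} s, CPU {wait_cpu:.3f} s")
print(f"busy:  wall {busy_wall:.3f} s, CPU {busy_cpu:.3f} s")
```

A reading of 0 ns, as in the question, may simply mean that the computation finished faster than the resolution of the timer in use, which is why a larger array or `%timeit` (which repeats the statement many times) gives a more useful number.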
  • Is it possible to use %timeit in the VSCode interactive window, or in some other way?

    • It should be, if you are using a Jupyter notebook inside VSCode. Otherwise, you can run ipython in the terminal.
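Outside IPython and Jupyter, the standard-library `timeit` module gives similar measurements in a plain Python script. A minimal sketch; `square_sum` here is a pure-Python stand-in and not necessarily the function used in the exercise:

```python
import timeit


def square_sum(n):
    """Sum of the squares of 0, 1, ..., n - 1."""
    return sum(i * i for i in range(n))


# Run the call 1000 times per measurement and repeat the measurement 5 times
number = 1000
timings = timeit.repeat(lambda: square_sum(1000), number=number, repeat=5)

best = min(timings) / number  # seconds per call in the fastest repetition
print(f"best of 5: {best * 1e6:.1f} us per call")
print(square_sum(1000))  # 332833500, the value shown in the %time output above
```

Repeating the statement many times and dividing by the number of calls is what keeps very fast code from being reported as taking zero time.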

Exercise 2

  • 4.62 s ± 79.3 ms per loop for walk(10_000_000) and 6.19 s ± 91.5 ms per loop for walk_vec(10_000_000)
    • similar results as 457 ms ± 21.8 ms per loop for walk(1_000_000) and 625 ms ± 18.3 ms per loop for walk_vec(1_000_000)
      • Good that you tested for 2 different input sizes. Based on those 2, we see that the unmodified walk_vec is, surprisingly, slower.

Reflections and quick feedback:

One thing that you liked or found useful for your projects?

  • Very interesting and useful class today! I appreciated the time to test things and the very clear exercises in the second part

One thing that was confusing/suboptimal, or something we should do to improve the learning experience?

  • more description for the parallel computing episode
  • The part on parallel computing was difficult to follow, probably because many terms and concepts were rather new to me or not well understood. A more detailed introduction would probably help those who, like me, are not very familiar with processes, threads, etc. The exercises could also be improved, I think: sometimes it is not clear what one should do, and there are technical details (like %%time having to be the first line in a cell) that slow down the execution.

Always ask questions at the very bottom of this document, right above this.