In order to successfully complete this assignment you must do the required reading, watch the provided videos and complete all instructions. The embedded survey form must be entirely filled out and submitted on or before 11:59pm on Tuesday March 16. Students must come to class the next day prepared to discuss the material covered in this assignment.
The following is a simple implementation of a matrix multiply written in Python. Review the code and try to understand what it is doing.
%matplotlib inline
import matplotlib.pylab as plt
import numpy as np
import sympy as sp
import random
import time
sp.init_printing(use_unicode=True)
#simple matrix multiply (no numpy)
def multiply(m1,m2):
    m = len(m1)     # number of rows in m1
    d = len(m2)     # number of rows in m2 (must equal the columns of m1)
    n = len(m2[0])  # number of columns in m2
    if len(m1[0]) != d:
        raise ValueError("ERROR - inner dimensions not equal")
    # The result has m rows and n columns
    result = [[0 for j in range(n)] for i in range(m)]
    for i in range(0,m):
        for j in range(0,n):
            for k in range(0,d):
                result[i][j] = result[i][j] + m1[i][k] * m2[k][j]
    return result
# Randomly generated 2D lists of lists that can be multiplied
m = 4
d = 10
n = 4
A = [[random.random() for i in range(d)] for j in range(m)]
B = [[random.random() for i in range(n)] for j in range(d)]
#Compute matrix multiply using your function
start = time.time()
simple_answer = multiply(A, B)
simple_time = time.time()-start
print('simple_answer =',simple_time,'seconds')
simple_answer = 8.296966552734375e-05 seconds
Let's compare this to the numpy result:
#Compare to numpy result
start = time.time()
np_answer = np.matrix(A)*np.matrix(B)
np_time = time.time()-start
print('np_answer =',np_time,'seconds')
np_answer = 0.07307672500610352 seconds
For this example, the numpy result is most likely slower than the simple result. Think about why this might be. We will discuss this later.
✅ DO THIS: See if you can write a loop to do a scaling study for the above code. Loop over the value of $n$ such that $n$ is 4, 16, 32, 64, 128 and 256. For each iteration generate two random matrices (as above) with $m = d = n$. Then time the matrix multiply for the provided function and again for the numpy function. Graph the results as size of $n$ vs time.
# Put your code here
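Single `time.time()` measurements of very short operations can be noisy, which matters for the small sizes in a scaling study. The following is a minimal sketch of a timing helper, not part of the original assignment: the name `best_time` and the `repeats` parameter are hypothetical, and it assumes that taking the smallest of several runs is an acceptable summary for your purposes.

```python
import time

def best_time(func, *args, repeats=5):
    """Return the smallest wall-clock time (in seconds) over `repeats` calls of func(*args).

    Hypothetical helper for illustration only; it is not part of the assignment code.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()   # high-resolution timer from the standard library
        func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best

# Example: time Python's built-in sorted() on a small list
print(best_time(sorted, [3, 1, 2]), "seconds")
```

The same helper could be called with any function and its arguments, for example the provided `multiply` function and two random matrices.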
✅ DO THIS: Explore the Internet for ways to speed up Python (there are a lot of them). Save some of your search results in the cell below and come to class prepared to discuss what you found.
Put your search results here.
Here is an example of running parallel Python using the multiprocessing library.
import multiprocessing
num_procs = multiprocessing.cpu_count()
print('You have', num_procs, 'processors')
def worker(procnum, return_dict):
    '''worker function'''
    print(str(procnum) + ' represent!')
    return_dict[procnum] = procnum
if __name__ == '__main__':
    manager = multiprocessing.Manager()
    return_dict = manager.dict()
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,return_dict))
        jobs.append(p)
        p.start()
    for proc in jobs:
        proc.join()
    print(return_dict.values())
You have 256 processors
0 represent!
1 represent!
2 represent!
3 represent!
4 represent!
[0, 1, 2, 3, 4]
The following is the instructor's attempt at using multiprocessing to do a matrix multiply. First, let's start with a serial method.
%matplotlib inline
import matplotlib.pylab as plt
import numpy as np
import sympy as sp
import random
import time
sp.init_printing(use_unicode=True)
#simple matrix multiply (no numpy)
def multiply(m1,m2):
    m = len(m1)     # number of rows in m1
    d = len(m2)     # number of rows in m2 (must equal the columns of m1)
    n = len(m2[0])  # number of columns in m2
    if len(m1[0]) != d:
        raise ValueError("ERROR - inner dimensions not equal")
    # The result has m rows and n columns
    result = [[0 for j in range(n)] for i in range(m)]
    for i in range(0,m):
        for j in range(0,n):
            for k in range(0,d):
                result[i][j] = result[i][j] + m1[i][k] * m2[k][j]
    return result
# Randomly generated 2D lists of lists that can be multiplied
m = 4
d = 10
n = 4
A = [[random.random() for i in range(d)] for j in range(m)]
B = [[random.random() for i in range(n)] for j in range(d)]
#Compute matrix multiply using your function
start = time.time()
simple_answer = multiply(A, B)
simple_time = time.time()-start
print('simple_answer =',simple_time,'seconds')
simple_answer = 8.320808410644531e-05 seconds
Let's compare this to the numpy result:
#Compare to numpy result
start = time.time()
np_answer = np.matrix(A)*np.matrix(B)
np_time = time.time()-start
print('np_answer =',np_time,'seconds')
np_answer = 0.00034308433532714844 seconds
#Compare to numpy result, this time converting the lists to matrices before starting the timer
A_ = np.matrix(A)
B_ = np.matrix(B)
start = time.time()
np_answer = A_*B_
np_time = time.time()-start
print('np_answer =',np_time,'seconds')
np_answer = 6.008148193359375e-05 seconds
np.allclose(simple_answer,np_answer)
True
On some systems the numpy result may be slower than the simple result. Think about why this might be. We will discuss this later.
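The cells above use `np.matrix`, which the NumPy documentation no longer recommends for new code. As a minimal sketch (not the instructor's method), the same product can be computed with regular arrays and the `@` operator. The names `A_arr`, `B_arr`, `C_arr`, and `C_mat` are introduced here for illustration only, and the matrix sizes simply mirror the ones used above.

```python
import random
import numpy as np

# Same shapes as in the cells above: A is m x d, B is d x n
m, d, n = 4, 10, 4
A = [[random.random() for i in range(d)] for j in range(m)]
B = [[random.random() for i in range(n)] for j in range(d)]

# Convert the lists of lists to regular NumPy arrays (illustrative names)
A_arr = np.asarray(A)
B_arr = np.asarray(B)

# The @ operator performs matrix multiplication on arrays
C_arr = A_arr @ B_arr

# Compare against the np.matrix approach used in this notebook
C_mat = np.matrix(A) * np.matrix(B)
print(C_arr.shape)
print(np.allclose(C_arr, C_mat))
```

Doing the conversion with `np.asarray` before starting a timer keeps the conversion cost out of the measurement, in the same way as the `A_` and `B_` cell above.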
#Attempt at a parallel multiply
def parallel_multiply(m1,m2):
    m = len(m1)
    d = len(m2)
    n = len(m2[0])
    # Note: using a nested function as a Process target requires the 'fork'
    # start method, which is not available on Windows
    def dot_worker(row,col):
        """process worker function: computes one entry of the result"""
        #print('Worker:', i,j)
        temp = 0
        for k in range(d):
            temp = temp + m1[row][k] * m2[k][col]
        return_dict[(row,col)] = temp
        return
    jobs = []
    manager = multiprocessing.Manager()
    return_dict = manager.dict()
    # Start one process per entry of the result matrix
    for i in range(m):
        for j in range(n):
            #p = dot_worker(i,j)
            p = multiprocessing.Process(target=dot_worker, args=(i,j,))
            jobs.append(p)
            p.start()
    for proc in jobs:
        proc.join()
    print('Used',len(jobs),'processes in calculation.')
    # Look up each entry by its (row, col) key. The order of return_dict.values()
    # depends on which process finished first and is not guaranteed to be row-major.
    C = np.matrix([[return_dict[(i,j)] for j in range(n)] for i in range(m)])
    return C
#Compute matrix multiply using your function
start = time.time()
parallel_answer = parallel_multiply(A, B)
parallel_time = time.time()-start
print('parallel_answer=',parallel_time,'seconds')
Used 16 processes in calculation.
parallel_answer= 0.09735774993896484 seconds
np.allclose(parallel_answer,np_answer)
True
import numpy as np
import matplotlib.pyplot as plt
objects = ('Simple', 'Numpy', 'parallel')
y_pos = np.arange(len(objects))
performance = [simple_time,np_time,parallel_time]
plt.bar(y_pos, performance, align='center', alpha=0.5)
plt.xticks(y_pos, objects)
plt.ylabel('Time (seconds)')
plt.yscale('log')
plt.title('Matrix multiply time by method')
Text(0.5, 1.0, 'Matrix multiply time by method')
✅ QUESTION: Why do you think the parallel version was so much slower than the simple Python version?
Put your answer to the above question here.
✅ DO THIS: Read the following wiki page and answer the questions: https://wiki.python.org/moin/GlobalInterpreterLock
✅ QUESTION: Why was the GIL introduced to the Python programming language?
Put your answer to the above question here.
✅ QUESTION: How does the GIL help avoid race conditions?
Put your answer to the above question here.
✅ QUESTION: How does the GIL help avoid deadlock?
Put your answer to the above question here.
✅ QUESTION: Why is the GIL problematic for parallel libraries like the "threading" and "multiprocessing" libraries?
Put your answer to the above question here.
Fortunately there are ways to get around the GIL. In fact, Python has libraries that do shared memory parallelization, shared network parallelization and GPU acceleration. Do some research and answer the following questions:
✅ QUESTION: Some of the numpy library can run in parallel. How does numpy get around the GIL?
Put your answer to the above question here.
✅ QUESTION: The numba library can also run in parallel. How does numba get around the GIL?
Put your answer to the above question here.
✅ QUESTION: What Python library can be used to program GPUs?
Put your answer to the above question here.
✅ QUESTION: What Python library can be used to run shared network parallelization such as the Message Passing Interface (MPI)?
Put your answer to the above question here.
✅ QUESTION: There seem to be a lot of solutions for running Python in parallel. Provide one or more arguments for why you would bother with an "older" language such as C/C++ or Fortran.
Put your answer to the above question here.
Please fill out the form that appears when you run the code below. You must completely fill this out in order to receive credit for the assignment!
If you have trouble with the embedded form, please make sure you log on with your MSU Google account at googleapps.msu.edu and then click on the direct link above.
✅ QUESTION: Summarize what you did in this assignment.
Put your answer to the above question here
✅ QUESTION: What questions do you have, if any, about any of the topics discussed in this assignment after working through the Jupyter notebook?
Put your answer to the above question here
✅ QUESTION: How well do you feel this assignment helped you to achieve a better understanding of the above mentioned topic(s)?
Put your answer to the above question here
✅ QUESTION: What was the most challenging part of this assignment for you?
Put your answer to the above question here
✅ QUESTION: What was the least challenging part of this assignment for you?
Put your answer to the above question here
✅ QUESTION: What kind of additional questions or support, if any, do you feel you need to have a better understanding of the content in this assignment?
Put your answer to the above question here
✅ QUESTION: Do you have any further questions or comments about this material, or anything else that's going on in class?
Put your answer to the above question here
✅ QUESTION: Approximately how long did this pre-class assignment take?
Put your answer to the above question here
from IPython.display import HTML
HTML(
"""
<iframe
src="https://cmse.msu.edu/cmse401-pc-survey"
width="100%"
height="500px"
frameborder="0"
marginheight="0"
marginwidth="0">
Loading...
</iframe>
"""
)
To get credit for this assignment you must fill out and submit the above survey form on or before the assignment due date.
Written by Dr. Dirk Colbry, Michigan State University
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.