# Frequently Used Python Codes
This page documents some frequently used Python snippets for office 420, MFM.
All code has been tested on Python 3.8.x.
[TOC]
## File processing
### find current directory
```python=
import os

if __name__ == "__main__":
    path = os.getcwd()
```
### create / delete folder
```python=
import shutil

work_path = os.path.join(path, 'new_folder_name')
if os.path.exists(work_path):
    shutil.rmtree(work_path)  # if the path exists, delete the folder
os.makedirs(work_path)
```
## Text processing
### Search a sub-string in a string
```python=
def is_in(full_str, sub_str):
    '''
    Input:
    - full_str: (type: string) the string where the sub-string might be located
    - sub_str: (type: string) the targeted sub-string
    Output:
    - True if sub_str is in full_str, False otherwise
    '''
    try:
        full_str.index(sub_str)
        return True
    except ValueError:
        return False
```
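For most use cases, the built-in `in` operator performs the same check in one line, so the helper above is mainly useful where the explicit function form is preferred:
```python=
# equivalent one-liner using the built-in membership operator
print('sub' in 'full string with sub inside')
# True
```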
### Separate a string into a list using member function `split()`
```python=
# Syntax for using split():
# string.split('separator', maxsplit)
# Examples
s = 'this is string example...fxxking raw!!'
list1 = s.split()  # default: split on whitespace
print(list1)
# ['this', 'is', 'string', 'example...fxxking', 'raw!!']
list2 = s.split('i')  # take 'i' as delimiter
print(list2)
# ['th', 's ', 's str', 'ng example...fxxk', 'ng raw!!']
list3 = s.split('i', 1)  # split only at the first 'i'
print(list3)
# ['th', 's is string example...fxxking raw!!']
```
This is useful when well-structured filenames are used, e.g., `20-100_l4_h5.csv` following the syntax `<power>-<speed>_<layer>_<clip_width>.csv`; the following code can then be used to locate/filter the files:
```python=
# work_path is the folder containing the .csv files
# obtain the file list under work_path
file_list = os.listdir(work_path)
# filter the .csv files
csv_list = [f for f in file_list if f.split('.')[-1] == 'csv']
# filter on layer == 4
# notice here f.split('_') will return
# ['20-100', 'l4', 'h5.csv']
layer = 4
csv_list = [f for f in file_list if (f.split('.')[-1] == 'csv') and
            (f.split('_')[1] == 'l%i' % layer)]
```
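Going one step further, the same `split()` calls can recover the individual fields of such a filename; a minimal sketch following the syntax above:
```python=
fname = '20-100_l4_h5.csv'
stem = fname.rsplit('.', 1)[0]          # strip the extension -> '20-100_l4_h5'
power_speed, layer, clip_width = stem.split('_')
power, speed = power_speed.split('-')
print(power, speed, layer, clip_width)
# 20 100 l4 h5
```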
### Combining a list into a string using `join()`
```python=
words = ['This', 'is', 'an', 'example', 'of', 'a', 'list']
string = "'%s'" % ' '.join('%s' % word for word in words)  # notice the <space> in ' ' before .join()
print(string)
# 'This is an example of a list'
string = "%s" % '_'.join('%s' % word for word in words)
print(string)
# This_is_an_example_of_a_list
```
### Sorting a list using `sort()`
```python=
nums = [23, 1, 4, 5, 9]
nums.sort()  # sorts in place -> [1, 4, 5, 9, 23]
```
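`sort()` works in place; the built-in `sorted()` returns a new list instead and accepts a `key` function, which is handy for the structured filenames above (a sketch with made-up names):
```python=
csv_list = ['20-100_l4_h5.csv', '20-100_l2_h5.csv', '20-100_l10_h5.csv']
# sort by the layer number encoded in the second field ('l2' -> 2, ...)
csv_list = sorted(csv_list, key=lambda f: int(f.split('_')[1][1:]))
print(csv_list)
# ['20-100_l2_h5.csv', '20-100_l4_h5.csv', '20-100_l10_h5.csv']
```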
## Data processing (using `Numpy` and `Scipy`)
### Output `dict` as `.npy` files
This method is very useful for saving metadata encountered during work into a readable/sharable file. The `.npy` format from `numpy` is more flexible and extendable than `.json` in many aspects (e.g., allowed data types), though it is less general and universal than `.json`. In that sense, we can make use of `.npy` in our own work (inward), while sharing `.json` for outward cooperation.
**Save**
```python=
import numpy as np

d = {'type': 'FeNi', 'fraction': np.linspace(0, 1, 100), 'c': 3}
np.save('dict.npy', d)
```
**Read**
```python=
load_dict = np.load('dict.npy', allow_pickle=True).item()
print(load_dict)
```
Remember that the option `allow_pickle` must be `True` here to allow the necessary parsing. If it is `False` or left at its default (not set), there will be an error message.
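For comparison, writing the same `dict` to `.json` for outward sharing requires converting the `ndarray` to a plain list first, since the standard `json` module does not accept NumPy types; a minimal sketch:
```python=
import json
import numpy as np

d = {'type': 'FeNi', 'fraction': np.linspace(0, 1, 100), 'c': 3}
d['fraction'] = d['fraction'].tolist()  # ndarray is not JSON-serializable
with open('dict.json', 'w') as f:
    json.dump(d, f)
```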
### fitting with Spline
```python=
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import InterpolatedUnivariateSpline

# x: numpy.ndarray (the data points)
# y: numpy.ndarray
#
# create spline
spline = InterpolatedUnivariateSpline(x, y, k=1)  # k (1 <= k <= 5) is the degree of the spline
# afterwards, spline can be called like a function
# plotting over a dense x-space
x0 = np.linspace(x.min(), x.max(), 200)
plt.plot(x0, spline(x0))
```
For a detailed example, see "Compacted Examples > Fitting a given set of data using Spline".
### interpolating an array as a piecewise-linear function
```python=
from scipy.interpolate import interp1d
E_intp = interp1d(T, E, kind='linear')
```
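A short usage sketch with made-up data; the returned object can be called like a function at any point inside the original range:
```python=
import numpy as np
from scipy.interpolate import interp1d

T = np.array([300., 400., 500.])
E = np.array([1.0, 1.8, 3.1])
E_intp = interp1d(T, E, kind='linear')
print(E_intp(450.))  # halfway between 1.8 and 3.1 -> 2.45
```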
## Table processing (using `Pandas`)
### load all sorts of files as `DataFrame`
```python=
import pandas as pd # load Pandas as alias 'pd'
pd.read_csv(filename)
# load from a .csv
pd.read_table(filename, sep=<delimiter>)
# load from a delimited text file. e.g.,
# sep=',' for ',' as delimiter
# sep='\t' for tab as delimiter
pd.read_excel(filename)
# load from an Excel file (.xls/.xlsx)
pd.read_sql(query, connection_object)
# load from a SQL database
pd.read_json(json_string)
# load from a string in JSON format
pd.read_html(url)
# Parse URLs, strings or HTML files and extract tables from them
pd.read_clipboard() # Get the content from your clipboard and pass it to read_table()
pd.DataFrame(dict)
# load from a dictionary object, Key is the column name, Value is the data
```
### saving a `DataFrame` to various formats
```python=
df.to_csv('output.csv', index=False)
# Saves the DataFrame as a .csv file
df.to_clipboard(sep=',', index=False)
# Copies the DataFrame to the system clipboard, separated by commas (,)
df.to_excel("output.xlsx")
# Saves the DataFrame as an Excel file
df.to_latex(index=False)
# Converts the DataFrame to LaTeX input
```
### data selection / slice
```python=
'''
for DataFrame
'''
df[col]
# Select a single column by name and return it as a Series
df[[col1, col2]]
# Return multiple columns as a DataFrame
df.iloc[0,:]
# Return the first row
df.iloc[0,0]
# Return the first element of the first column
df.values[:,:-1]
# Return all data for all columns except the last column
df.query('[1, 2] not in c')
# Return the rows whose column c contains neither 1 nor 2
'''
For Series
'''
s.iloc[0]
# Select data by position
s.loc['index_one']
# Select data by index
```
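The `query()` call above can equivalently be written as plain boolean indexing, which some may find easier to read:
```python=
# same rows as df.query('[1, 2] not in c')
df[~df['c'].isin([1, 2])]
```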
### Shift column by index
```python=
df.shift(periods=-1)
# Shifts the whole DataFrame up by one period; by default the last row becomes NaN
df.shift(periods=3, fill_value=0)
# Shifts the whole DataFrame down by three periods and fills the vacated cells with the given value
df['col2'] = df['col1'].shift(periods=1)
# Creates a new column col2 from col1 shifted down by one value;
# by default, NaN is assigned to the first cell of the new column
```
### Append a `DataFrame` to the end of another
```python=
df = df.append(dfn)
# both df and dfn are DataFrames
# similar to list.append; df can start out as an empty DataFrame
```
It is worth noting that this method can also be used to iteratively collect data via `dict` objects. In that case, the `ignore_index` option should be turned on:
```python=
df = pd.DataFrame()
for T in Tspace:
    X = func(T)
    df = df.append({'T': T, 'X': X}, ignore_index=True)
```
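Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0; on newer versions, the recommended pattern is to collect the rows first and build the `DataFrame` once (or use `pd.concat`):
```python=
rows = []
for T in Tspace:
    X = func(T)
    rows.append({'T': T, 'X': X})
df = pd.DataFrame(rows)  # build the DataFrame once at the end
```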
### Rename column titles by `dict`
```python=
df = df.rename(columns={'a': 'one', 'c': 'three'})
# column 'a' is renamed to 'one' and column 'c' to 'three';
# columns not listed in the dict keep their names
```
### Merge two `DataFrame` by matching column values
```python=
# rows of df1 and df2 are matched wherever both their
# 'Power' and 'Speed' columns agree (an inner join by default)
df = pd.merge(df1, df2, on=['Power', 'Speed'])
```
## General plot setups (using `matplotlib`)
These setups can be used together with the `plot_essentials` module from `MEMER`.
### make the logarithmic scale
```python=
ax.set_xscale('log')
ax.set_yscale('log')
```
### make the scientific notation
```python=
## Notice here 'scilimits' gives the range that will NOT
## be written in scientific notation, e.g., (-1, 2) means only
## data beyond 10^2 or below 10^-1 will be written that way.
ax1.ticklabel_format(style='sci', scilimits=(-1,2), axis='y')
```
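A minimal self-contained sketch combining both settings with made-up data:
```python=
import numpy as np
import matplotlib.pyplot as plt

x = np.logspace(0, 3, 50)
fig, (ax, ax1) = plt.subplots(1, 2)
ax.plot(x, x**2)
ax.set_xscale('log')
ax.set_yscale('log')
ax1.plot(x, x**2)
ax1.ticklabel_format(style='sci', scilimits=(-1, 2), axis='y')
plt.show()
```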
# Compacted Examples
## Numerical applications
### make the backward finite difference
$$
f'(a)\approx\frac{f(a)-f(a-h)}{h}
$$
where shifted copies of the columns 'a' and 'f' are required.

Now shift the columns 'a' and 'f' upward to create the new columns 'a_n' and 'f_n', so that each row holds its own values next to those of the following row:
```python=
import pandas as pd
df = pd.read_csv('table.csv') # read a .csv file named 'table.csv'
df['a_n'] = df['a'].shift(-1)  # a_n holds the next row's value of a
df['f_n'] = df['f'].shift(-1)  # f_n holds the next row's value of f
df.head()
```

***Notice*** after the shift the last row is filled with `NaN` and should be removed using
```python=
df = df.dropna(axis=0,how='any')
```
Then compute the difference quotient between neighbouring rows using the member function `apply()`:
```python=
df['d_f'] = df[['a', 'f', 'a_n', 'f_n']].apply(lambda x: (x['f_n'] - x['f']) / (x['a_n'] - x['a']), axis=1)
df.head()
```
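The same difference quotients can also be computed directly with NumPy, skipping the shifted columns (applied to the original, unmodified `df`); a sketch:
```python=
import numpy as np
# np.diff takes differences of consecutive elements, so this
# reproduces (f_n - f) / (a_n - a) for each pair of neighbouring rows
d_f = np.diff(df['f'].values) / np.diff(df['a'].values)
```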

### Finding intersection between two lines
This code uses `numpy.linalg.solve()` to find where two lines intersect, using only the coordinates of their endpoints.
```python=
import numpy as np
# Give the endpoints coordinates
# Line 1 passing through points p1 (x1,y1) and p2 (x2,y2)
p1 = [0, 0]
p2 = [1, 1]
# Line 2 passing through points p3 (x3,y3) and p4 (x4,y4)
p3 = [0, 1]
p4 = [1, 0]
# Line 1 dy, dx and determinant
a11 = (p1[1] - p2[1])
a12 = (p2[0] - p1[0])
b1 = (p1[0]*p2[1] - p2[0]*p1[1])
# Line 2 dy, dx and determinant
a21 = (p3[1] - p4[1])
a22 = (p4[0] - p3[0])
b2 = (p3[0]*p4[1] - p4[0]*p3[1])
# Construction of the linear system A @ [x, y] = -b
# coefficient matrix
A = np.array([[a11, a12],
              [a21, a22]])
# right-hand-side vector
b = -np.array([b1,
               b2])
# solve
try:
    intersection_point = np.linalg.solve(A, b)
    print('Intersection point detected at:', intersection_point)
except np.linalg.LinAlgError:
    print('No single intersection point detected')
```
The example above gives the output:
```
Intersection point detected at: [0.5 0.5]
```
The answer can also be confirmed graphically by making a simple plot of lines 1 and 2:
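A minimal sketch of such a plot, reusing the endpoint lists and the solution from above:
```python=
import matplotlib.pyplot as plt
plt.plot([p1[0], p2[0]], [p1[1], p2[1]], label='Line 1')
plt.plot([p3[0], p4[0]], [p3[1], p4[1]], label='Line 2')
plt.scatter(*intersection_point, color='red', zorder=3)  # the solved point
plt.legend()
plt.show()
```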

### Fitting a given set of data using Spline
This code uses `InterpolatedUnivariateSpline` from `scipy` to fit a given set of data points. Either the whole range of the data or some part(s) of it can be fitted.
```python=
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import InterpolatedUnivariateSpline
# your data
df = pd.read_csv("quadratic_data.csv") # read a .csv file named 'quadratic_data.csv'
# x-space for the spline
x0 = np.linspace(0, 4)
# create spline
s2 = InterpolatedUnivariateSpline(df['x'], df['y'], k=1)  # k (1 <= k <= 5) is the degree of the spline
# plottings
plt.scatter(df['x'], df['y'])
plt.plot(x0, s2(x0))
```
Example: the scattered points to be fitted:

Fitted graph for k = 1:

Fitted graph for k = 2:
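To reproduce such comparison plots, the spline degree `k` can simply be varied in a loop over the code above; a sketch:
```python=
for k in (1, 2):
    s = InterpolatedUnivariateSpline(df['x'], df['y'], k=k)
    plt.plot(x0, s(x0), label='k = %i' % k)
plt.scatter(df['x'], df['y'])
plt.legend()
plt.show()
```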

### Applying a function on a `DataFrame` by iterating the columns with an exception
```python=
import pandas as pd

labels = ['Time', 'avg_T', 'avg_vm', 'avg_peeq']
df = pd.read_csv('table.csv')
# exceptions using `if` and `continue`
for l in labels:
    if l == 'Time':
        continue
    # rest of the loop code, for example
    df[l] = df.apply(lambda x: x[l] / x['avg(c)'], axis=1)
```
### Grouping and calculation of average values of `DataFrame`
This takes a `DataFrame` and collects the values of the other columns into lists, grouped by the keys of one specific column.
```python=
import pandas as pd
import numpy as np
df = pd.DataFrame({'column1': ['key1', 'key1', 'key2', 'key2'],
                   'column2': [1, 6, 23, 2],
                   'column3': ['value11', 'value11', 'value22', 'value22'],
                   'column4': ['value44', 'value44', 'value55', 'value55']})
display(df)
df1 = pd.DataFrame()
df2 = pd.DataFrame()
df1['grouped1'] = df.groupby('column1')['column2'].apply(list)
df1=df1.reset_index()
display(df1)
df2['grouped2'] = df.groupby('column1')['column4'].apply(list)
display(df2)
```
In a second step, the average of the values collected for each key is calculated:
```python=
df1['avg1'] = df1.apply(lambda x: np.mean(x['grouped1']), axis=1)  # row-wise apply
df1['avg2'] = df1['grouped1'].apply(lambda x: np.mean(x))          # equivalent, column-wise apply
display(df1)
```
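The two steps can also be collapsed into a single `groupby` aggregation, since `mean()` is available directly on the grouped column; a sketch:
```python=
# average of column2 per key of column1, in one call
df.groupby('column1')['column2'].mean().reset_index()
```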
## HPC application
### Generating a quick `batch.sh` for running multiple simulations in series from the sub-folders
This is very useful for $\mathrm{MuMax^3}$-ish simulations, which take relatively little time each but come in large numbers. Because of the limited number of GPUs, such simulations rely heavily on running in series.
The following code quickly writes a `batch.sh` that runs all simulations from the sub-folders in series.
This could also be achieved with `bash` alone; nonetheless, the provided code can be integrated into other `python` projects, say, a batched input-file generator or so.
```python=
import os, io

# find current directory
if __name__ == "__main__":
    path = os.getcwd()
    input_name = 'input.mx3'
    command = 'mumax3'
    work_path = path  # can be changed to the folder you want to work with
    # keep only sub-folders (os.listdir also returns plain files)
    folders = [f for f in os.listdir(path)
               if os.path.isdir(os.path.join(path, f))]
    main_str = str()
    for f in folders:
        main_str += 'cd %s \n' % (os.path.join(work_path, f))
        main_str += '%s %s \n' % (command, input_name)
        main_str += 'cd %s \n' % (work_path)
        main_str += '\n'
    print(main_str)
    bash_path = os.path.join(work_path, 'batch.sh')
    if os.path.exists(bash_path):
        os.remove(bash_path)
    outf = io.open(bash_path, 'w', newline='\n')
    outf.write(main_str)
    outf.close()
```
The generated `batch.sh` then contains, for each sub-folder, a `cd` into that folder, the `mumax3 input.mx3` call, and a `cd` back to `work_path`.