###### tags: `python` `jupyter notebook` `replace` `re.sub`

# Python_string_manipulation

### Replace strings

---

#### Replace strings with 1 or multiple patterns

Note: use `re.sub`. Patterns are joined with `|` to match any one of them.

```python!
import re

# Remove a parenthesised note, e.g. ' (whatever)'
string = 'Bolivia (Plurinational State of)'
pattern1 = r' \(.*\)'
re.sub(pattern1, '', string)    # 'Bolivia'

# Remove any run of digits (here, the number that follows the string)
string2 = 'Bolivia (Plurinational State of)20'
pattern2 = r'[0-9]+'
re.sub(pattern2, '', string2)   # 'Bolivia (Plurinational State of)'

# Replace with 2 patterns at once by joining them with '|'
patterns = pattern1 + '|' + pattern2
re.sub(patterns, '', string2)   # 'Bolivia'
```

---

#### Replace strings in a column with 1 pattern

```python!
import pandas as pd
import numpy as np

# Import data
# Note: `column_names` must already be defined as a list of column names
energy = pd.read_excel('D:/Now/workshops/20191013_coursera_Introduction-to-Data-Science-in-Python/course1_downloads/Energy Indicators.xls',
                       skiprows=17,
                       names=column_names,
                       nrows=227)

# Replace cells whose value is exactly '...' with NaN (applies to every column)
energy = energy.replace('...', np.nan)
```

---

#### Replace strings in a column with multiple patterns

* syntax: `DataFrame['column'].str.replace('pattern1|pattern2', 'replaced-string', regex=True)`

```python!
# Replace strings
## Regular expressions in Python differ from bash globs (`*` in bash = `.*` in a Python regex)
## Delete parentheses and the text within them
## Delete any digits, of any length, that follow country names
## regex=True is needed: since pandas 2.0, str.replace treats the pattern as a literal string by default
energy['Country'] = energy['Country'].str.replace(r' \(.*\)|[0-9]+', '', regex=True)
```
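
---

#### Check the combined pattern without pandas

Before applying a combined pattern to a whole column, it can help to try it on a few values with plain `re`. The snippet below is only a minimal sketch: the `clean_name` helper and the sample list are made up for illustration and are not part of the original notebook. It compiles the same two-alternative pattern once and applies it to each value.

```python
import re

# Same combined pattern as above, compiled once:
#   ' \(.*\)'  -> a space followed by a parenthesised note
#   '[0-9]+'   -> any run of digits
CLEANUP = re.compile(r' \(.*\)|[0-9]+')

def clean_name(name):
    """Return `name` with parenthesised notes and digits removed (hypothetical helper)."""
    return CLEANUP.sub('', name)

# Illustrative sample values, not taken from the real data file
samples = [
    'Bolivia (Plurinational State of)',
    'Bolivia (Plurinational State of)20',
    'Japan10',
    'Chad',
]

cleaned = [clean_name(s) for s in samples]
print(cleaned)
```

Note that `.*` is greedy: if a value contained two parenthesised notes, everything from the first `(` to the last `)` would be removed. A non-greedy `.*?` could be used if that is not wanted.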