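Alignment is an important part of data cleaning. Arithmetic between pandas objects is aligned by index: positions that have no match in the other operand are set to NaN, and those NaN values can be filled afterwards.
Alignment operations on Series
1. Series are aligned by row index
Example code: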
import pandas as pd

# s1 has labels 0-9, s2 only has labels 0-4
s1 = pd.Series(range(10, 20), index=range(10))
s2 = pd.Series(range(20, 25), index=range(5))

print('s1: ')
print(s1)
print('')
print('s2: ')
print(s2)
Output:
s1: 
0    10
1    11
2    12
3    13
4    14
5    15
6    16
7    17
8    18
9    19
dtype: int64

s2: 
0    20
1    21
2    22
3    23
4    24
dtype: int64
2. Arithmetic on aligned Series
Example code:
# Series alignment operation
s1 + s2
Output:
0    30.0
1    32.0
2    34.0
3    36.0
4    38.0
5     NaN
6     NaN
7     NaN
8     NaN
9     NaN
dtype: float64
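The example above uses integer labels that happen to coincide with positions, which can hide the fact that matching is done by label. The following is a minimal sketch with string labels in a different order on each side; the names left, right, total, and filled, and the label values, are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data for illustration; not taken from the example above.
left = pd.Series([1, 2, 3], index=["a", "b", "c"])
right = pd.Series([10, 20, 30], index=["c", "a", "d"])

# Values are matched by label, not by position:
# "a" gives 1 + 20 and "c" gives 3 + 10.
# "b" and "d" exist on one side only, so they become NaN.
total = left + right
print(total)

# The NaN values can be replaced after the operation.
filled = total.fillna(0)
print(filled)
```

The result carries the union of both indexes, so it has four labels even though each input has three.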
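Alignment operations on DataFrame
1. DataFrames are aligned by row and column index
Example code:
import numpy as np

# df1 is 2x2 with columns a, b; df2 is 3x3 with columns a, b, c
df1 = pd.DataFrame(np.ones((2, 2)), columns=['a', 'b'])
df2 = pd.DataFrame(np.ones((3, 3)), columns=['a', 'b', 'c'])

print('df1: ')
print(df1)
print('')
print('df2: ')
print(df2)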
Output:
df1: 
     a    b
0  1.0  1.0
1  1.0  1.0

df2: 
     a    b    c
0  1.0  1.0  1.0
1  1.0  1.0  1.0
2  1.0  1.0  1.0
2. Arithmetic on aligned DataFrames
Example code:
# DataFrame alignment operation
df1 + df2
Output:
     a    b   c
0  2.0  2.0 NaN
1  2.0  2.0 NaN
2  NaN  NaN NaN
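To make the two-axis alignment easier to see, here is a small sketch in which the two frames overlap in only one row label and one column label. The frames left and right and their labels are hypothetical and are not part of the example above.

```python
import numpy as np
import pandas as pd

# Hypothetical frames for illustration; the labels are arbitrary.
left = pd.DataFrame(np.ones((2, 2)), index=["x", "y"], columns=["a", "b"])
right = pd.DataFrame(np.ones((2, 2)), index=["y", "z"], columns=["b", "c"])

# The result covers the union of the row labels and the union of the column labels.
total = left + right
print(total)

# Only the cell ("y", "b") is present in both frames,
# so it is the only cell that is not NaN.
print(total.notna().sum().sum())
```

Both the rows and the columns are matched by label, so a cell receives a value only when its row label and its column label appear in both operands.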
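Filling unaligned data during an operation
1. fill_value
When calling add, sub, div, or mul, pass fill_value to specify a fill value. Wherever a label exists in only one operand, the missing side is replaced with the fill value before the operation is applied.
Example code:
print(s1)
print(s2)
s1.add(s2, fill_value=-1)

print(df1)
print(df2)
df1.sub(df2, fill_value=2.)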
Output:
# print(s1)
0    10
1    11
2    12
3    13
4    14
5    15
6    16
7    17
8    18
9    19
dtype: int64

# print(s2)
0    20
1    21
2    22
3    23
4    24
dtype: int64

# s1.add(s2, fill_value = -1)
0    30.0
1    32.0
2    34.0
3    36.0
4    38.0
5    14.0
6    15.0
7    16.0
8    17.0
9    18.0
dtype: float64

# print(df1)
     a    b
0  1.0  1.0
1  1.0  1.0

# print(df2)
     a    b    c
0  1.0  1.0  1.0
1  1.0  1.0  1.0
2  1.0  1.0  1.0

# df1.sub(df2, fill_value = 2.)
     a    b    c
0  0.0  0.0  1.0
1  0.0  0.0  1.0
2  1.0  1.0  1.0
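It may help to contrast fill_value with filling NaN after the operation, since the two give different results. The sketch below reuses s1 and s2 as defined earlier; the names before, after, left, right, and both are hypothetical and introduced only for this comparison.

```python
import numpy as np
import pandas as pd

s1 = pd.Series(range(10, 20), index=range(10))
s2 = pd.Series(range(20, 25), index=range(5))

# fill_value substitutes the missing operand before adding,
# so an unmatched label keeps the value from the side that has one.
before = s1.add(s2, fill_value=0)

# fillna replaces NaN after adding, so an unmatched label becomes 0.
after = (s1 + s2).fillna(0)

print(before.loc[5], after.loc[5])

# Hypothetical frames: when a cell is missing on both sides,
# fill_value is not applied and the result stays NaN.
left = pd.DataFrame({"a": [1.0, np.nan]})
right = pd.DataFrame({"a": [np.nan, np.nan]})
both = left.add(right, fill_value=0)
print(both)
```

Which of the two is appropriate depends on whether a missing entry should behave like a neutral operand or whether the whole result at that label should be treated as unknown.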
Table of arithmetic methods: