4.3. Testing
This section shows how to compare two pandas DataFrames or two pandas Series.
4.3.1. assert_frame_equal: Test Whether Two DataFrames Are Similar
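To test whether two DataFrames are equal, and to see where and by how much they differ, use pandas.testing.assert_frame_equal. When the DataFrames differ, it raises an AssertionError that names the first differing column and reports the percentage of values that do not match.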
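from pandas.testing import assert_frame_equal
import pandas as pd
df1 = pd.DataFrame({'coll': [1,2,3], 'col2': [4,5,6]})
# df2 differs from df1 in two of the three 'coll' values
df2 = pd.DataFrame({'coll': [1,3,4], 'col2': [4,5,6]})
# Raises AssertionError because the DataFrames are not equal
assert_frame_equal(df1, df2)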
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
/tmp/ipykernel_56620/3851828215.py in <module>
5 df1 = pd.DataFrame({'coll': [1,2,3], 'col2': [4,5,6]})
6 df2 = pd.DataFrame({'coll': [1,3,4], 'col2': [4,5,6]})
----> 7 assert_frame_equal(df1, df2)
[... skipping hidden 2 frame]
~/book/venv/lib/python3.8/site-packages/pandas/_libs/testing.pyx in pandas._libs.testing.assert_almost_equal()
~/book/venv/lib/python3.8/site-packages/pandas/_libs/testing.pyx in pandas._libs.testing.assert_almost_equal()
~/book/venv/lib/python3.8/site-packages/pandas/_testing/asserters.py in raise_assert_detail(obj, message, left, right, diff, index_values)
663 msg += f"\n[diff]: {diff}"
664
--> 665 raise AssertionError(msg)
666
667
AssertionError: DataFrame.iloc[:, 0] (column name="coll") are different
DataFrame.iloc[:, 0] (column name="coll") values are different (66.66667 %)
[index]: [0, 1, 2]
[left]: [1, 2, 3]
[right]: [1, 3, 4]
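In a test suite you will often want the comparison to be less strict than the default, or to inspect the failure message rather than let it stop the run. The sketch below is one way to do that. The frames `expected` and `actual` and the column names `x` and `y` are made up for illustration. `check_dtype` and `check_like` are existing keyword arguments of assert_frame_equal: the first skips the dtype check, the second ignores the order of index and columns.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical data: same values, but 'x' is stored as float in `actual`
# and the columns appear in a different order.
expected = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]})
actual = pd.DataFrame({"y": [4.0, 5.0, 6.0], "x": [1.0, 2.0, 3.0]})

# Strict comparison: catch the AssertionError so the message can be inspected.
try:
    assert_frame_equal(expected, actual)
    strict_message = None
except AssertionError as err:
    strict_message = str(err)

# Relaxed comparison: ignore dtype differences and column order.
# No exception is raised here, so the frames count as equal under these rules.
assert_frame_equal(expected, actual, check_dtype=False, check_like=True)
relaxed_passed = True

print(strict_message)
```

The strict call fails because the column order differs, while the relaxed call passes. Which checks are safe to relax depends on what the test is meant to guarantee, so treat these flags as a starting point rather than a default.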