4.3. Testing

This section shows how to compare two pandas DataFrames or two pandas Series.
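
For Series, pandas provides pandas.testing.assert_series_equal, which works like its DataFrame counterpart. The following is a minimal sketch; the Series values and the name "col1" are made up for illustration:

```python
import pandas as pd
from pandas.testing import assert_series_equal

s1 = pd.Series([1, 2, 3], name="col1")
s2 = pd.Series([1, 2, 3], name="col1")

# Matching Series: the call returns None and raises nothing.
assert_series_equal(s1, s2)

# A differing value raises an AssertionError that describes the mismatch.
s3 = pd.Series([1, 3, 4], name="col1")
message = None
try:
    assert_series_equal(s1, s3)
except AssertionError as err:
    message = str(err)

print(message)
```

Catching the AssertionError here is only for demonstration; in a test suite you would normally let it propagate so the test fails.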

4.3.1. assert_frame_equal: Test Whether Two DataFrames Are Similar

To test whether two DataFrames match, or to see where and by how much they differ, use pandas.testing.assert_frame_equal. It returns nothing when the DataFrames match and raises an AssertionError describing the mismatch otherwise.

from pandas.testing import assert_frame_equal
import pandas as pd 


df1 = pd.DataFrame({'col1': [1,2,3], 'col2': [4,5,6]})
df2 = pd.DataFrame({'col1': [1,3,4], 'col2': [4,5,6]})
assert_frame_equal(df1, df2)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
/tmp/ipykernel_56620/3851828215.py in <module>
      5 df1 = pd.DataFrame({'col1': [1,2,3], 'col2': [4,5,6]})
      6 df2 = pd.DataFrame({'col1': [1,3,4], 'col2': [4,5,6]})
----> 7 assert_frame_equal(df1, df2)

    [... skipping hidden 2 frame]

~/book/venv/lib/python3.8/site-packages/pandas/_libs/testing.pyx in pandas._libs.testing.assert_almost_equal()

~/book/venv/lib/python3.8/site-packages/pandas/_libs/testing.pyx in pandas._libs.testing.assert_almost_equal()

~/book/venv/lib/python3.8/site-packages/pandas/_testing/asserters.py in raise_assert_detail(obj, message, left, right, diff, index_values)
    663         msg += f"\n[diff]: {diff}"
    664 
--> 665     raise AssertionError(msg)
    666 
    667 

AssertionError: DataFrame.iloc[:, 0] (column name="col1") are different

DataFrame.iloc[:, 0] (column name="col1") values are different (66.66667 %)
[index]: [0, 1, 2]
[left]:  [1, 2, 3]
[right]: [1, 3, 4]
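
Some differences, such as column order or floating-point rounding noise, may not matter for your test. assert_frame_equal has keyword arguments that control how strict the comparison is. The sketch below illustrates two of them, check_like and check_exact. The DataFrames and the frames_match helper are hypothetical and exist only to turn the pass/fail outcome into a boolean:

```python
import pandas as pd
from pandas.testing import assert_frame_equal

expected = pd.DataFrame({"col1": [1, 2, 3], "col2": [0.1, 0.2, 0.3]})

# Same data, but the columns are in a different order and the last
# value of col2 carries rounding noise (0.1 + 0.2 is not exactly 0.3).
actual = pd.DataFrame({"col2": [0.1, 0.2, 0.1 + 0.2], "col1": [1, 2, 3]})


def frames_match(left, right, **kwargs):
    """Hypothetical helper: True if assert_frame_equal passes, else False."""
    try:
        assert_frame_equal(left, right, **kwargs)
    except AssertionError:
        return False
    return True


# Default settings: fails because the column order differs.
strict = frames_match(expected, actual)

# check_like=True ignores the order of index and columns; the rounding
# noise is within the default tolerance for float columns.
reordered = frames_match(expected, actual, check_like=True)

# check_exact=True compares values exactly, so the rounding noise
# now counts as a difference.
exact = frames_match(expected, actual, check_like=True, check_exact=True)

print(strict, reordered, exact)
```

Which options are appropriate depends on what the test is meant to guarantee; relaxing the comparison too far can hide real regressions.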