Table 1. Distribution of named entities before and after filtering.
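| Entity | Distribution (%) | Original count | Filtered count |
|--------|------------------|----------------|----------------|
| OG     | 37.8             | 38,143,982     | 3,137,400      |
| AF     | 6.1              | 6,153,459      | 522,900        |
| CV     | 45.8             | 46,215,902     | 3,801,400      |
| FD     | 1.6              | 1,614,704      | 132,800        |
| TR     | 1.0              | 1,009,190      | 83,000         |
| TM     | 7.6              | 7,669,844      | 630,800        |
| NW     | 0.1              | 100,919        | 8,331          |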