Our data is spatiotemporally rich and is one of the few publicly available datasets of PM values collected in a developing country. The PM value distributions of a developing and a developed country differ starkly. To corroborate this point and the need for a dataset like ours, we compare our dataset, on a number of parameters, with a mobile air monitoring dataset[2] from the city of Hamilton in Ontario, Canada, collected during 114 days of air pollution monitoring between November 2005 and November 2016. For the purpose of comparison, we divide Delhi and Hamilton into regions by rounding the latitude and longitude of each record to 3 decimal places and then grouping all the records that fall in the same region.
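The region assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the analysis: the record layout (dicts with `lat`, `lon`, and `pm2_5` keys), the function names, and the sample coordinates are all assumptions made for the example.

```python
from collections import defaultdict


def region_key(lat, lon, decimals=3):
    """Return the region of a record: its coordinates rounded to `decimals` places."""
    return (round(lat, decimals), round(lon, decimals))


def group_by_region(records, decimals=3):
    """Group records (dicts with 'lat' and 'lon' keys) by their rounded coordinates."""
    regions = defaultdict(list)
    for rec in records:
        regions[region_key(rec["lat"], rec["lon"], decimals)].append(rec)
    return dict(regions)


# Hypothetical sample records; the field names are illustrative only.
records = [
    {"lat": 28.61394, "lon": 77.20902, "pm2_5": 210.0},
    {"lat": 28.61412, "lon": 77.20877, "pm2_5": 190.0},
    {"lat": 28.63271, "lon": 77.21953, "pm2_5": 250.0},
]
regions = group_by_region(records)

# The first two records round to the same coordinates and share a region.
points_per_region = {key: len(recs) for key, recs in regions.items()}
```

Since 0.001 degrees of latitude is roughly 111 m, each region under this scheme is a cell on the order of 100 m on a side.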

Comparison of Delhi and Canada datasets over different parameters and statistical aspects


The following tables compare the Delhi dataset with the Canada dataset[2] over different parameters. They also present a statistical comparison of the PM values recorded in both datasets.

The table below compares the Delhi dataset, the Canada dataset[2] and the Zurich dataset[1] over different metrics.

| Metric | Delhi dataset | Canada dataset | Zurich dataset |
|---|---|---|---|
| Total number of samples | 12542183 | 46080 | Varying for different pollutants (19.9 - 49.7 million) |
| Samples with at least one PM (1.0, 2.5, 10.0) value | 12542183 | 13048 | - |
| Pollutants covered | PM1, PM2.5, PM10 | CO, NO, NO2, SO2, O3, PM1, PM2.5, PM10 | O3, CO, NO2 and UFP |
| Vehicles used | Public bus | Commercial van | Trams |
| Monitoring period | 91 days | 114 days | Varying for different pollutants (2.5 - 4.5 years) |

The table below presents cost estimates for the other sensors (COx, NOx, SOx). We are currently trying to augment some of our instruments with additional sensors, but this needs careful planning as budgets in developing countries are scarce. We started our measurements with PM because accurate PM sensors are available at 2,500 INR.

| Pollutant | Vendor 1 (Honeywell) | Vendor 2 (Alphasense) | Vendor 3 (Winsen) |
|---|---|---|---|
| SO2 | 17,337 INR | 8,500 INR | 10,600 INR |
| NO | 14,600 INR | 8,500 INR | - |
| NO2 | 10,618 INR | 8,500 INR | 10,530 INR |
| CO | - | 8,500 INR | 4,450 INR |
| CO2 | - | - | 2,910 INR |
| O3 | - | - | 4,288.28 INR |

The table below provides a statistical comparison of the PM values of the Delhi dataset and the Canada dataset[2].

| Metric | Delhi PM1 | Delhi PM2.5 | Delhi PM10 | Canada PM1 | Canada PM2.5 | Canada PM10 |
|---|---|---|---|---|---|---|
| Min | 1 | 1 | 1 | 0 | 0 | 0 |
| Max | 1730.5 | 1792 | 1903 | 2640 | 731 | 291 |
| Mean | 120.348 | 207.9248 | 226.1106 | 46.45 | 15.08 | 12.15 |
| Std | 57.2723 | 114.3632 | 123.8647 | 97.36 | 12.87 | 9.02 |
| 5th percentile | 45 | 72 | 79 | 4 | 6 | 8 |
| 95th percentile | 233 | 435 | 471 | 28 | 32 | 138 |
| Missing % | 0 | 0 | 0 | 72.24 | 73.62 | 71.71 |

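
As a rough illustration of how summary rows like those in the statistical comparison can be computed for a single PM column, the sketch below uses only the Python standard library. It rests on several assumptions that are not stated in the source: missing readings are represented as `None`, the standard deviation is the population standard deviation, and percentiles use linear interpolation between closest ranks. The function names and sample readings are hypothetical.

```python
import statistics


def percentile(sorted_vals, p):
    """Percentile (p in [0, 100]) by linear interpolation between closest ranks."""
    if not sorted_vals:
        raise ValueError("no values")
    k = (len(sorted_vals) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def summarize(values):
    """Summary row for one PM column; None marks a missing reading (an assumption)."""
    present = sorted(v for v in values if v is not None)
    return {
        "min": present[0],
        "max": present[-1],
        "mean": statistics.fmean(present),
        "std": statistics.pstdev(present),  # population std, assumed
        "p5": percentile(present, 5),
        "p95": percentile(present, 95),
        "missing_pct": 100 * (len(values) - len(present)) / len(values),
    }


# Hypothetical readings for one pollutant, with three missing values.
readings = [10.0, None, 20.0, 30.0, None, 40.0, 50.0, None]
summary = summarize(readings)
```

A different percentile convention or the sample standard deviation would give slightly different numbers on small inputs, so the exact conventions matter when reproducing a table like this one.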
Comparison based on total number of data points collected across all regions

The first parameter used for comparison is the total number of records collected in each region of the city over the whole duration of the data. The figures below show this for Hamilton and Delhi. The color of each circle in the maps represents the total number of points collected in that region. As the maps show, the circles over most regions in Hamilton are green, which corresponds to 0-20 points per region. Over Delhi, by contrast, the colors vary widely, and most regions have more than 500 data points. This indicates that the number of data points collected per region in Delhi is much higher than in Hamilton.

Total number of data points across all regions
(a) Delhi PM count
(b) Hamilton PM count
Comparison based on average value of PM2.5 level across all regions

The graphs below show the average of the PM2.5 values recorded in each region throughout the dataset for Hamilton and Delhi. The color of each circle in the maps represents the average PM2.5 in that region. We can observe that the average PM2.5 value recorded in most regions of Hamilton lies in the 0-50 range, and only a few regions have an average greater than 50. Almost none of the regions record an average greater than 100 across the whole duration of the data. In Delhi, on the other hand, the average PM2.5 for most regions is greater than 250, which is much higher than in any region of Hamilton.

Average PM2.5 level across all regions
(a) Delhi
(b) Hamilton
Comparison based on variance of PM2.5 level across all regions

Here, we compare the variance in PM2.5 level in each region across the whole duration of the data. The figure below shows this for Hamilton and Delhi. The color of each circle in the maps represents the PM2.5 variance in that region. For Hamilton, the variance stays within a narrow range of 0-50 across most regions, and there are very few points where it is greater than 150. Note also that this variance is observed over a period spanning 11 years. In Delhi, on the other hand, we see very high variance in the PM2.5 levels recorded across almost all regions, and this variance has been observed over just 3 months.

Variance of PM2.5 level across all regions
(a) Delhi
(b) Hamilton
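
The per-region average and variance of PM2.5 used in the two comparisons above could be computed along the following lines. This is a sketch under stated assumptions: records are `(lat, lon, pm2_5)` tuples, regions are formed by rounding coordinates to 3 decimal places as described earlier, and the variance is the population variance. The function name and the sample values are hypothetical.

```python
from collections import defaultdict
import statistics


def region_stats(records, decimals=3):
    """Map each region to the (mean, variance) of its PM2.5 readings."""
    by_region = defaultdict(list)
    for lat, lon, pm25 in records:
        by_region[(round(lat, decimals), round(lon, decimals))].append(pm25)
    return {
        region: (statistics.fmean(vals), statistics.pvariance(vals))  # population variance, assumed
        for region, vals in by_region.items()
    }


# Hypothetical (lat, lon, PM2.5) records.
records = [
    (28.61394, 77.20902, 200.0),
    (28.61412, 77.20877, 300.0),
    (28.63271, 77.21953, 250.0),
]
stats = region_stats(records)
mean_a, var_a = stats[(28.614, 77.209)]
```

Each region's pair of values would then be mapped to a circle color on the corresponding map.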
Comparison based on number of hours covered across entire dataset

In the Canada dataset[2], there are some hours with no samples at all, as the bar plots below show. For each hour of the day, we count the total number of minutes that have at least one sample across the whole dataset. In our dataset, we have samples for each minute of each hour, whereas the Canada dataset[2] has at least 9 empty hours, most of them at night.

Number of minutes covered across each hour
(a) Delhi
(b) Hamilton
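
The minute-coverage count described in this comparison can be sketched as follows. The interpretation is an assumption: for each hour of the day (0-23), the code counts how many distinct minutes of that hour (0-59) contain at least one sample anywhere in the dataset, so the maximum per hour is 60. The function name and timestamps are hypothetical.

```python
from datetime import datetime


def minutes_covered_per_hour(timestamps):
    """For each hour of day (0-23), count distinct minutes (0-59) with at least one sample."""
    seen = {(ts.hour, ts.minute) for ts in timestamps}
    coverage = [0] * 24
    for hour, _minute in seen:
        coverage[hour] += 1
    return coverage


# Hypothetical sample timestamps.
timestamps = [
    datetime(2020, 11, 1, 9, 0, 5),
    datetime(2020, 11, 1, 9, 0, 40),  # same hour and minute as above, counted once
    datetime(2020, 11, 2, 9, 1, 0),
    datetime(2020, 11, 2, 14, 30, 0),
]
coverage = minutes_covered_per_hour(timestamps)

# Hours of the day with no samples at all.
empty_hours = [hour for hour, n in enumerate(coverage) if n == 0]
```

Under this reading, a dataset with samples in every minute of every hour yields 60 for all 24 entries, and an empty hour shows up as a zero.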
Comparison based on frequency of PM values

The plots below show the frequency of PM values in the Delhi and Canada datasets respectively. Most of the PM2.5 values lie in the range 0 to 60 for the Canada dataset, while they lie in the range 0 to 750 for Delhi. Not only is the range of PM values wider in our dataset, but the frequency of each PM value is also higher. The most frequent PM2.5 value is around 10 for Hamilton and around 150 for Delhi. The same analysis holds for the PM1 and PM10 values.

Frequency distribution of PM2.5, PM1 & PM10 values across the entire Canada dataset[2]
Frequency distribution of PM2.5, PM1 & PM10 values across the entire Delhi dataset
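
A frequency distribution like the ones plotted above can be approximated by counting readings per fixed-width bin. The sketch below is illustrative only: the bin width of 10, the function name, and the sample values are assumptions, and the source does not state how the plotted frequencies were binned.

```python
from collections import Counter


def frequency(values, bin_width=10):
    """Count readings per bin; each bin is labeled by its lower edge."""
    return Counter(int(v // bin_width) * bin_width for v in values)


# Hypothetical PM2.5 readings.
values = [8.0, 12.5, 14.0, 19.9, 150.0, 151.2, 155.0, 158.0, 600.0]
hist = frequency(values)

# The most frequent bin and its count.
mode_bin, mode_count = hist.most_common(1)[0]
```

The most frequent bin plays the role of the "most frequent PM2.5 value" quoted in the text, up to the chosen bin width.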

References


  1. Jason Jingshi Li, Boi Faltings, Olga Saukh, David Hasenfratz, and Jan Beutel. Sensing the air we breathe: The OpenSense Zurich dataset. In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, AAAI'12, pages 323–325. AAAI Press, 2012.

  2. Matthew D. Adams and Denis Corr. A mobile air pollution monitoring data set. Data, 4(1), 2019.