ITMS data set query

I think there is a big mismatch between the static data and the live data available for the ITMS bus challenge.
The problem is this:
In the static data, there is a shape file for each unique route.
That shape does not match the trip ID in the live data.

So, how shall I use the live data when it is not reliable?


Hi Nihesh!
This is a known problem with the dataset. The live bus location of a few trips will often not correspond to its static route. This is because the same bus starts a new trip after completing the previous one, but the driver does not update the trip on the bus.
To detect such occurrences, one can look at the variation in STOP_ID. For a bus with “good” data, the STOP_ID keeps changing as time progresses. Also, the CURRENT_STOP_SEQUENCE field increases steadily with time. You can look for these trends and keep only the trips that show them for your evaluation.

This is also shown in the ITMS Jupyter notebook.
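
For anyone who wants to try the check described above outside the notebook, here is a minimal sketch in plain Python. It is not the notebook's code. The STOP_ID and CURRENT_STOP_SEQUENCE field names come from the live data as mentioned above; the "timestamp" key, the function name and the min_distinct_stops threshold are assumptions made for illustration, so adjust them to the actual feed.

```python
def looks_like_good_trip(records, min_distinct_stops=3):
    """Heuristic filter for one trip's live records.

    `records` is a list of dicts, one per live update, with the keys
    "timestamp" (assumed name), "STOP_ID" and "CURRENT_STOP_SEQUENCE".
    A trip passes if STOP_ID takes several distinct values and
    CURRENT_STOP_SEQUENCE never goes backwards and ends higher than it started.
    """
    if not records:
        return False

    # Order the updates in time before looking at any trend.
    ordered = sorted(records, key=lambda r: r["timestamp"])
    stop_ids = [r["STOP_ID"] for r in ordered]
    sequences = [int(r["CURRENT_STOP_SEQUENCE"]) for r in ordered]

    # A bus reporting a stale trip tends to stay on the same STOP_ID.
    if len(set(stop_ids)) < min_distinct_stops:
        return False

    # The stop sequence should only move forward as time progresses.
    non_decreasing = all(a <= b for a, b in zip(sequences, sequences[1:]))
    return non_decreasing and sequences[-1] > sequences[0]


# Made-up example records, only to show the expected input shape.
moving_bus = [
    {"timestamp": 1, "STOP_ID": "101", "CURRENT_STOP_SEQUENCE": 1},
    {"timestamp": 2, "STOP_ID": "102", "CURRENT_STOP_SEQUENCE": 2},
    {"timestamp": 3, "STOP_ID": "103", "CURRENT_STOP_SEQUENCE": 3},
]
stale_bus = [
    {"timestamp": 1, "STOP_ID": "101", "CURRENT_STOP_SEQUENCE": 1},
    {"timestamp": 2, "STOP_ID": "101", "CURRENT_STOP_SEQUENCE": 1},
    {"timestamp": 3, "STOP_ID": "101", "CURRENT_STOP_SEQUENCE": 1},
]
```

Group the live records by trip ID first, then keep only the trips for which the function returns True.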

Hi Nihesh,

Could you provide details of an instance of the problem you are referring to?

Two known issues with regard to static and live data are:

  1. The stop_id field in the live data refers to the stop_code field in the static file stops.txt.
  2. As mentioned in the earlier comment, even when a bus is in transit, if the trip_id is not configured correctly, the bus location gets updated, but the stop_sequences are not updated correctly.
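
To illustrate the first point, here is a rough sketch of looking up the static stop for a live record by matching the live stop_id against stop_code rather than against the static stop_id. The stop_id and stop_code column names are as described above; the helper names and the sample rows are made up for illustration and are not taken from the ITMS data.

```python
import csv
import io


def index_stops_by_code(stops_file):
    """Read stops.txt (an open text file) into a dict keyed by stop_code."""
    reader = csv.DictReader(stops_file)
    return {row["stop_code"].strip(): row for row in reader}


def static_stop_for_live_record(live_record, stops_by_code):
    """Return the static stop row for a live record, or None if no match.

    The live stop_id is compared with the static stop_code,
    not with the static stop_id.
    """
    return stops_by_code.get(str(live_record["stop_id"]).strip())


# Made-up stand-in for stops.txt; in practice use open("stops.txt", newline="").
sample_stops = io.StringIO(
    "stop_id,stop_code,stop_name\n"
    "1,5001,Example Stop A\n"
    "2,5002,Example Stop B\n"
)
stops_by_code = index_stops_by_code(sample_stops)

# Hypothetical live record whose stop_id is really a stop_code.
live_record = {"trip_id": "T1", "stop_id": 5002}
matched_stop = static_stop_for_live_record(live_record, stops_by_code)
```

A result of None means the live stop_id has no matching stop_code in stops.txt, which is worth logging rather than silently dropping.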