The problem statement asks us to predict the estimated time of arrival from five parameters. The inputs to the algorithm, i.e. Route_ID, Direction, Bus_Name, Time and Stop_Code, are all static. This means they carry no information about anomalies, so the problem boils down to building a simple model or a timetable. In other words, it answers the simple question “At what time is the bus supposed to come?” from existing statistics, and doesn’t really “predict” anything.
Is my understanding right?
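To show what I mean by "time-table", here is a rough Python sketch. The sample records, the delay column and the function names are all made up by me for illustration; none of them come from the problem statement. With only static inputs, the best a model can do is return one fixed value per input combination, which amounts to a lookup of historical averages:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical historical records (invented for illustration):
# (Route_ID, Direction, Bus_Name, Stop_Code, Time, observed delay in minutes)
history = [
    ("R1", "N", "B12", "S100", "08:00", 2.0),
    ("R1", "N", "B12", "S100", "08:00", 4.0),
    ("R1", "N", "B12", "S100", "08:00", 3.0),
    ("R1", "S", "B07", "S200", "09:30", 1.0),
]

def build_timetable(records):
    """Average the observed delay for each combination of static inputs."""
    delays = defaultdict(list)
    for route, direction, bus, stop, time, delay in records:
        delays[(route, direction, bus, stop, time)].append(delay)
    return {key: mean(values) for key, values in delays.items()}

timetable = build_timetable(history)

def expected_delay(route, direction, bus, stop, time):
    """Return the historical mean delay, or None for an unseen combination."""
    return timetable.get((route, direction, bus, stop, time))
```

The same five inputs always give the same answer, whatever is happening on the road today, which is why I would call this a timetable rather than a prediction.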
PS. I think including the ‘current_location’ would play a big role in making a dynamic “prediction”. Also, modeling the “type of anomaly” could improve the prediction, but that would be an entirely different problem statement.