Web scraping

Problem / Goal
Dozens of airlines, hundreds of routes, prices changing every day, tickets selling out every hour: gathering flight data efficiently is a large and complex task. Skyscanner needed automated tools that could update prices as often as required to offer reliable information.
Solution
First, we designed and built software that gathered ticket pricing data online for the selected airlines. Later, we also created scripts that gathered data from competing websites and compared it with Skyscanner's data.
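
The comparison step described above can be illustrated with a minimal Python sketch. It is not the production code: the Fare record, its fields, and the compare_fares function are hypothetical names chosen for illustration, and prices are assumed to be already converted to one common currency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fare:
    """One scraped price (hypothetical record layout)."""
    airline: str
    route: str    # e.g. "EDI-WAW"
    date: str     # ISO departure date
    price: float  # assumed to be in one common currency


def compare_fares(own, competitor):
    """List fares where the competitor is cheaper or where we have no price.

    Returns tuples of (key, own_price, competitor_price); own_price is None
    when the fare is missing from our own data.
    """
    own_by_key = {(f.airline, f.route, f.date): f.price for f in own}
    report = []
    for f in competitor:
        key = (f.airline, f.route, f.date)
        own_price = own_by_key.get(key)
        if own_price is None or f.price < own_price:
            report.append((key, own_price, f.price))
    return report
```

Keying each fare by airline, route, and date keeps the comparison independent of the order in which the individual scripts return their results.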
Results
Skyscanner currently uses a few hundred such scripts and can present a rich and accurate range of airline tickets. It is now easy for Skyscanner to detect what data is missing and from where (that is, for which airlines or websites gaps have appeared) and to develop new data-mining applications. Thanks to these improvements, Skyscanner can almost always provide a better offer than the competition.
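
Detecting where data is missing could, for example, come down to checking when each source last delivered results. The sketch below is an assumption about how such a check might look, not a description of Skyscanner's actual tooling: find_data_gaps, the source names, and the one-hour freshness threshold are all hypothetical.

```python
from datetime import datetime, timedelta


def find_data_gaps(last_success, expected_sources, now, max_age=timedelta(hours=1)):
    """Report sources (airlines or websites) with missing or outdated data.

    last_success maps a source name to the time of its last successful scrape.
    max_age is an assumed freshness threshold, not a documented value.
    """
    gaps = {}
    for source in expected_sources:
        seen = last_success.get(source)
        if seen is None:
            gaps[source] = "never scraped"
        elif now - seen > max_age:
            gaps[source] = "stale"
    return gaps
```

Passing the current time in as an argument, rather than reading the clock inside the function, makes the check easy to test against recorded scrape logs.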

TEAM

Radosław Szuban
Infrastructure Engineer