Why is my query taking so long to complete?

Hint: It's probably too wide-ranging.

There are a number of reasons why the query you are executing on Snowflake for the Flight Info Direct product may take a long time. Here are some common reasons:

Not using conditional statements when querying the databases

OAG provides customers with a variety of data sets, and many of those data sets can be very large with billions of of rows of data. Because of this, we recommend customers to make use of conditional statements as much as possible, to reduce the amount of results returned to only what you required.

For example, the schedules data set gets updated with a complete forward-looking schedules snapshot (indicated by the FILE_DATE field) every morning UTC time. Every snapshot, which is indicated by the FILE_DATE parameter, contains about 100 million rows of flight schedules data. Therefore, if customers query the database without including a FILE_DATE condition, Snowflake is parsing through several billion rows at once which will take a longer period of time to execute.

If you would like to compare between different snapshots, our recommendation is to filter down the number of snapshots to exactly what you need for your query, or make use of other conditional statements to reduce the amount of data Snowflake has to process.

Using a completely wide-open SELECT statement (SELECT * FROM TABLE) on the schedules or connections database will take hours to complete due to the sheer amount of data that is being queried. Because of this, we do not recommend customers to execute such a query. Use conditional statements whenever possible.

Sorting the results with any of the unique identifiers in the data sets

The unique identifiers in our dataset: oag_schedule_fingerprint, connection_id, scheduled_flight_leg_pk and other similar ID fields are used to identify unique rows based on differing criteria in each of the data sets. As the values for those fields are a long string of random characters and digits, it takes a lot of processing for Snowflake to sort this field. As much as possible, we recommend customers to not use SORT or similar CTEs on the field.