I don't know about you, but when I started using Airflow, the concepts of schedule interval, execution date, start date, end date, catchup and so on were very confusing to me. I spent hours just trying to understand how everything works. As a gift, here is a quick reminder just for you.

The start date is the date at which your DAG starts being scheduled. This date can be in the past or in the future. Think of the start date as the start of the data interval you want to process.

In addition to the start date, you need a schedule interval. The schedule interval defines the interval of time at which your DAG gets triggered. It is represented either by a CRON expression or a timedelta object. For example, a daily schedule means every day at midnight.

Now, VERY IMPORTANT: your DAG is triggered after the start date + the schedule interval. So your DAG is effectively triggered at 00:00 at the end of the first interval, and not at 00:00 on the start date.

Once your DAG is triggered, there is another, very confusing concept to know: the so-called "execution date". The execution date is NOT the date at which your DAG got triggered; it is the date of the beginning of the data interval you want to process. In this example, the execution date is the 00:00 at the start of that interval, because Airflow says, "If you want to process the data of a given day, then you need to wait for 00:00 of the following day in order to have all the data. Then, your DAG processes the last 24 hours of data."

From Airflow 2.2, a scheduled DAG always has a data interval. The data_interval_start = the logical_date = the execution_date, whereas the data_interval_end is the date at which the DAG is effectively triggered. So all of those changes are more semantic than anything else, but they make DAG scheduling much easier to understand than before. The functions get_next_data_interval(dag_id) and get_run_data_interval(dag_run) give you the next and current data intervals respectively.

Now that all the basics and concepts are clear, it's time to talk about the Airflow Timetable. First things first, what is a Timetable? A Timetable is a class that defines the schedule interval of your DAG and describes what to do if it is triggered manually or triggered by the scheduler. Keep in mind that whenever you set a schedule interval on a DAG, there is always a timetable behind the scenes. That being said, let me show you a concrete example.
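To make the reminder about start date, schedule interval and execution date more tangible, here is a minimal sketch of the date arithmetic in plain Python. It does not use Airflow at all. The helper name `data_interval_for_run` and the dates are hypothetical and chosen only for illustration; the sketch simply assumes a fixed timedelta schedule and the rule described in this post (a run is triggered at the end of its data interval).

```python
from datetime import datetime, timedelta


def data_interval_for_run(start_date, schedule_interval, run_index):
    """Return (data_interval_start, data_interval_end) for the n-th scheduled run.

    Hypothetical helper for illustration only, not part of Airflow.
    run_index is 0 for the first scheduled run.
    """
    data_interval_start = start_date + run_index * schedule_interval
    data_interval_end = data_interval_start + schedule_interval
    return data_interval_start, data_interval_end


# Hypothetical values: a DAG starting on 1 January 2023 with a daily schedule.
start_date = datetime(2023, 1, 1)
schedule_interval = timedelta(days=1)

first_start, first_end = data_interval_for_run(start_date, schedule_interval, 0)

# The logical date (execution date) is the START of the data interval ...
logical_date = first_start
# ... while the run is effectively triggered at the END of the interval.
triggered_at = first_end

print("logical date:", logical_date)
print("triggered at:", triggered_at)
```

With these assumed values, the first run covers the 24 hours that begin at the start date and is triggered one schedule interval later, which is why the execution date always appears to lag one interval behind the trigger time.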