
AIRFLOW-6609: Airflow upgradedb fails to add serialized_dag table at revision d38e04c12aa2


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.10.7
    • Fix Version/s: None
    • Component/s: database
    • Labels:

      Description

      We're attempting an upgrade from 1.10.3 to 1.10.7 to use some of the great features available in later revisions; however, the upgrade from 1.10.6 to 1.10.7 is causing some heartburn.

      Runtime environment:

      • Docker containers for each runtime segment (webserver, scheduler, flower, postgres, redis, worker)
      • Using CeleryExecutor with Redis as the queue/broker
      • Using Postgres backend

       

      Steps to reproduce:

      1. Build a base image for each Airflow version between 1.10.3 and 1.10.7 (if you want to reproduce the full regression we ran)
      2. Run 'airflow initdb' on revision 1.10.3
      3. Start up the containers, run some DAGs, and produce metadata
      4. Swap the base image from the 1.10.3 base to the 1.10.4 base image
      5. Run 'airflow upgradedb'
      6. Validate success (e.g. by checking the Alembic revision, as in the sketch after this list)
      n. Eventually you will step up from the 1.10.6 revision to 1.10.7, which produces the error shown below the sketch
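
      A minimal sketch of the revision check from step 6, assuming SQLAlchemy is importable on the host; the connection string is a placeholder for your own sql_alchemy_conn, and the query just reads the standard alembic_version table that the migrations maintain:

      from sqlalchemy import create_engine, text

      # Hypothetical helper: report which Alembic revision the metadata DB is on.
      # Run it before and after each 'airflow upgradedb' step to confirm progress.
      def current_alembic_revision(sql_alchemy_conn):
          engine = create_engine(sql_alchemy_conn)
          with engine.connect() as conn:
              return conn.execute(text("SELECT version_num FROM alembic_version")).scalar()

      # Placeholder URI; substitute your own Postgres backend connection string.
      print(current_alembic_revision("postgresql+psycopg2://airflow:airflow@postgres:5432/airflow"))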

       

      INFO  [alembic.runtime.migration] Running upgrade 6e96a59344a4 -> d38e04c12aa2, add serialized_dag table
      Revision ID: d38e04c12aa2
      Revises: 6e96a59344a4
      Create Date: 2019-08-01 14:39:35.616417
      Traceback (most recent call last):
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
          cursor, statement, parameters, context
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 581, in do_execute
          cursor.execute(statement, parameters)
      psycopg2.errors.DuplicateTable: relation "serialized_dag" already exists
      The above exception was the direct cause of the following exception:

      Traceback (most recent call last):
        File "/opt/anaconda/miniconda3/envs/airflow/bin/airflow", line 37, in <module>
          args.func(args)
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/airflow/utils/cli.py", line 75, in wrapper
          return f(*args, **kwargs)
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/airflow/bin/cli.py", line 1193, in upgradedb
          db.upgradedb()
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/airflow/utils/db.py", line 376, in upgradedb
          command.upgrade(config, 'heads')
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/command.py", line 298, in upgrade
          script.run_env()
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/script/base.py", line 489, in run_env
          util.load_python_file(self.dir, "env.py")
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/util/pyfiles.py", line 98, in load_python_file
          module = load_module_py(module_id, path)
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/util/compat.py", line 173, in load_module_py
          spec.loader.exec_module(module)
        File "<frozen importlib._bootstrap_external>", line 678, in exec_module
        File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/airflow/migrations/env.py", line 96, in <module>
          run_migrations_online()
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/airflow/migrations/env.py", line 90, in run_migrations_online
          context.run_migrations()
        File "<string>", line 8, in run_migrations
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/runtime/environment.py", line 846, in run_migrations
          self.get_context().run_migrations(**kw)
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/runtime/migration.py", line 518, in run_migrations
          step.migration_fn(**kw)
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/airflow/migrations/versions/d38e04c12aa2_add_serialized_dag_table.py", line 54, in upgrade
          sa.PrimaryKeyConstraint('dag_id'))
        File "<string>", line 8, in create_table
        File "<string>", line 3, in create_table
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/operations/ops.py", line 1250, in create_table
          return operations.invoke(op)
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/operations/base.py", line 345, in invoke
          return fn(self, operation)
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/operations/toimpl.py", line 101, in create_table
          operations.impl.create_table(table)
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/ddl/impl.py", line 252, in create_table
          self._exec(schema.CreateTable(table))
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/ddl/impl.py", line 134, in _exec
          return conn.execute(construct, *multiparams, **params)
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 982, in execute
          return meth(self, multiparams, params)
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/sql/ddl.py", line 72, in _execute_on_connection
          return connection._execute_ddl(self, multiparams, params)
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1044, in _execute_ddl
          compiled,
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1250, in _execute_context
          e, statement, parameters, cursor, context
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception
          util.raise_from_cause(sqlalchemy_exception, exc_info)
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
          reraise(type(exception), exception, tb=exc_tb, cause=cause)
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 152, in reraise
          raise value.with_traceback(tb)
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
          cursor, statement, parameters, context
        File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 581, in do_execute
          cursor.execute(statement, parameters)
      sqlalchemy.exc.ProgrammingError: (psycopg2.errors.DuplicateTable) relation "serialized_dag" already exists
      [SQL:
      CREATE TABLE serialized_dag (
      	dag_id VARCHAR(250) NOT NULL,
      	fileloc VARCHAR(2000) NOT NULL,
      	fileloc_hash INTEGER NOT NULL,
      	data JSON NOT NULL,
      	last_updated TIMESTAMP WITHOUT TIME ZONE NOT NULL,
      	PRIMARY KEY (dag_id)
      )]
      (Background on this error at: http://sqlalche.me/e/f405)
      

       

       

      It doesn't make much sense: there is only one reference to this table addition in the codebase, so we're not sure why this migration is going awry.
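
      A minimal sketch, again with a placeholder connection string, for confirming whether the table really is already in the metadata DB and what shape it is in:

      from sqlalchemy import create_engine, inspect

      # Placeholder URI; substitute your own Postgres backend connection string.
      engine = create_engine("postgresql+psycopg2://airflow:airflow@postgres:5432/airflow")
      inspector = inspect(engine)

      # Is serialized_dag already present, and with which columns?
      if "serialized_dag" in inspector.get_table_names():
          for column in inspector.get_columns("serialized_dag"):
              print(column["name"], column["type"])
      else:
          print("serialized_dag not present")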

      Possible solutions:

      • Instead of bailing out, it may be more productive to issue a warning when a step like this fails. The intent of the migration process is to say 'you can't run on version x', but here it is the outcome of the migration itself that is confusing.
      • Migrations could check ahead of time whether the change they are about to apply is already in place (we did this for a bug found in later revisions, against a different backend, MSSQL); this would add some overhead, but metadata upgrades would at least be self-aware (a rough sketch follows this list)
      • Something else I'm missing in the broader picture
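
      A rough sketch of the second idea above. This is not the actual d38e04c12aa2 migration, just the guard concept, with the column list reconstructed from the CREATE TABLE statement in the traceback:

      import logging

      import sqlalchemy as sa
      from alembic import op

      log = logging.getLogger(__name__)

      def upgrade():
          conn = op.get_bind()
          # "Check ahead": if the table is already present, warn and skip
          # instead of letting CREATE TABLE abort the whole upgrade.
          if conn.dialect.has_table(conn, "serialized_dag"):
              log.warning("serialized_dag already exists; skipping table creation")
              return
          op.create_table(
              "serialized_dag",
              sa.Column("dag_id", sa.String(250), nullable=False),
              sa.Column("fileloc", sa.String(2000), nullable=False),
              sa.Column("fileloc_hash", sa.Integer(), nullable=False),
              sa.Column("data", sa.JSON(), nullable=False),
              sa.Column("last_updated", sa.DateTime(), nullable=False),
              sa.PrimaryKeyConstraint("dag_id"),
          )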

       

      If the DB truly already has the table, end users should still be able to upgrade their version, so it's odd to get an error when changing revisions if things are already in place for the future revision.
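
      On that last point, a blunt workaround sketch for the case where the existing serialized_dag table has been verified to match what d38e04c12aa2 would create: hand-stamp the Alembic revision so the upgrade can proceed past it (roughly what 'alembic stamp d38e04c12aa2' does). The connection string is a placeholder, and this assumes the schema really is already in the target shape:

      from sqlalchemy import create_engine, text

      # Placeholder URI; substitute your own Postgres backend connection string.
      engine = create_engine("postgresql+psycopg2://airflow:airflow@postgres:5432/airflow")
      with engine.begin() as conn:
          # Mark d38e04c12aa2 as applied so 'airflow upgradedb' continues from there.
          conn.execute(text("UPDATE alembic_version SET version_num = 'd38e04c12aa2'"))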

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              cschmautz Chris Schmautz
            • Votes:
              0
              Watchers:
              1
