Oozie / OOZIE-751

Sqoop jobs through Oozie hang if I try to load 3 or more tables in parallel

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not a Problem
    • Affects Version/s: 3.2.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:

      CentOS 5.0, hadoop-0.20.2, sqoop-1.3.0, oozie-2.3.2

      Description

      I want to load data from SQL Server to HDFS and am using the Oozie Sqoop action as described at http://archive.cloudera.com/cdh/3/oozie-2.3.2-cdh3u3/DG_SqoopActionExtension.html.

      It works when I try to copy 1 table but when I try to copy 3 or more tables in parallel then the job just hangs. I don't see any errors anywhere in the logs.

      • I have confirmed that there are no deadlocks on the database side.
      • I have confirmed that loading multiple tables in parallel with the sqoop command line works fine.

      It looks like the problem is somewhere in the Oozie Sqoop action.

      One more thing I noticed is that there are 3 Oozie jobs running in the Oozie console, but 6 jobs are shown in the JobTracker UI (please see the attached screenshot). I am not sure why that is.

      The workflow.xml file, the tasktracker logs for the task, and a listing of how the Oozie directory looks on HDFS are attached.

      1. how_3_oozie_jobs_look_in_jobtracker_ui.png
        111 kB
        Aman Preet Singh
      2. job_201202161642_33931_taskdetailshistory.jsp.htm
        9 kB
        Aman Preet Singh
      3. tasklog.htm
        81 kB
        Aman Preet Singh
      4. this_is_how_oozie_directory_structure_looks_in_hdfs.txt
        3 kB
        Aman Preet Singh
      5. workflow.xml
        2 kB
        Aman Preet Singh

        Activity

        Alejandro Abdelnur added a comment -

        Does your cluster have enough slots to run all those jobs concurrently?

        Can you run your sqoop commands in parallel from the command line (from different terminals)?

        Oozie uses a launcher job (an MR job with 1 map task and 0 reduce tasks) to launch the real job (the Sqoop job in your case). You can see the job names being oozie:launcher and oozie:action.

        Aman Preet Singh added a comment -

        Yes. The cluster is set to run 100 Mappers and it has more than 60 map slots free.

        Yes, I am able to run sqoop commands in parallel from the command line. I tried loading 6 tables in parallel and it runs fine.

        Yes, I can see oozie:launcher and oozie:action (see attachment how_3_oozie_jobs_look_in_jobtracker_ui.png).

        Harsh J added a comment -

        As Alejandro also pointed out, and since your environment appears to have a running-job limit, you were running into OOZIE-9.
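
        The interaction described in this thread (one launcher job plus one action job per Sqoop action, combined with a cap on concurrently running jobs) can be illustrated with a toy model. This is only a sketch under stated assumptions: the function name `simulate`, the FIFO admission order, and the limit values are hypothetical and are not taken from Oozie, Hadoop, or the reporter's cluster configuration.

```python
from collections import deque


def simulate(num_actions: int, running_job_limit: int) -> str:
    """Toy model: each action needs a launcher job that holds a
    running-job slot until the child job it submits has finished.

    Returns "completes" or "hangs". All names and the scheduling
    policy are illustrative assumptions, not Oozie or Hadoop APIs.
    """
    # All launchers are submitted up front, as in a parallel fork.
    pending = deque(("launcher", i) for i in range(num_actions))
    running = set()          # jobs currently holding a running-job slot
    child_submitted = set()  # launchers that already submitted their child
    done = 0

    while done < num_actions:
        progressed = False

        # 1. The scheduler admits pending jobs (FIFO) while slots are free.
        while pending and len(running) < running_job_limit:
            running.add(pending.popleft())
            progressed = True

        # 2. Every running launcher submits its child (action) job once.
        for kind, i in sorted(running):
            if kind == "launcher" and i not in child_submitted:
                pending.append(("action", i))
                child_submitted.add(i)
                progressed = True

        # 3. A running action job finishes and frees its own slot
        #    and the slot of the launcher that was waiting on it.
        for kind, i in sorted(running):
            if kind == "action":
                running.discard(("action", i))
                running.discard(("launcher", i))
                done += 1
                progressed = True

        # No admission, submission, or completion happened: deadlock.
        if not progressed:
            return "hangs"

    return "completes"


if __name__ == "__main__":
    for actions, limit in [(2, 3), (3, 3), (3, 6)]:
        print(actions, "actions, limit", limit, "->", simulate(actions, limit))
```

        In this model, once the number of parallel actions reaches the running-job limit, the launchers occupy every slot and the action jobs they submitted can never start, which looks like a hang with no errors in the logs. It also shows why 3 Oozie jobs appear as 6 jobs in the JobTracker UI.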

        Harsh J added a comment -

        Reopening to close with proper resolution.


          People

          • Assignee: Unassigned
          • Reporter: Aman Preet Singh
          • Votes: 0
          • Watchers: 2
