Falcon / FALCON-2302

Same partition gets exported over and over

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 0.10
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:

      Hortonworks Data Platform 2.6.1

      Description

      I want to export an HCat/Hive table to Oracle. The table is partitioned by year, month, and day. As expected, an Oozie workflow is generated for every day, but when I look at the sqoopCommand, I can see it always exports the same partition (the partition of the feed start).

      The expected behaviour is that for every day partition a workflow should be generated that exports that given partition.

      1. coordinator.xml
        8 kB
        Johannes Mayer
      2. oozie_job_configuration.xml
        6 kB
        Johannes Mayer
      3. OracleFeed.xml
        1 kB
        Johannes Mayer
      4. workflow.xml
        8 kB
        Johannes Mayer

        Activity

        sandeep.samudrala sandeep samudrala added a comment - edited

        Johannes Mayer : Can you add more details on the feed definition and the coord/wf configs that got created. If possible can you add the launcher job's configurations or launch time params that are getting executed.

        joha0123 Johannes Mayer added a comment - edited

        I attached the feed definition and coord/wf config.
        I think the problem is in coordinator.xml

        <property>
            <name>sqoopCommand</name>
            <value>export   --connect jdbc:oracle:thin:@192.168.145.210:1521:dmine --table HADOOP_TEST --username XYZ --password xxx --num-mappers 1 --update-mode allowinsert --skip-dist-cache --hcatalog-database ${coord:databaseIn('export-input')} --hcatalog-table ${coord:tableIn('export-input')} --hcatalog-partition-keys ds_year,ds_month,ds_day --hcatalog-partition-values  ${coord:dataInPartitionMin('export-input','ds_year')},${coord:dataInPartitionMin('export-input','ds_month')},${coord:dataInPartitionMin('export-input','ds_day')}  
            </value>
        </property>
        

        The values ${coord:dataInPartitionMin('export-input','ds_year')}, ${coord:dataInPartitionMin('export-input','ds_month')}, ${coord:dataInPartitionMin('export-input','ds_day')} are always 2017,05,17.

        Thank you for your help!

        joha0123 Johannes Mayer added a comment

        In the oozie_job_configuration you can see that nominalTime is 2017-07-22-10-30, but the sqoopCommand inserts 2017,05,17.

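        For context (an editor's sketch, not part of the original report): Oozie resolves coord:current(n) relative to each action's nominal time, so a correctly parameterized dataset instance should track nominalTime instead of staying fixed. A rough Python illustration of that resolution, assuming a fixed-length (daily) dataset frequency; the function name coord_current is hypothetical:

```python
from datetime import datetime, timedelta

def coord_current(n, nominal, initial_instance, frequency):
    """Approximate Oozie's coord:current(n): the n-th dataset instance
    counted from the latest instance at or before the action's nominal
    time. Simplified to fixed-length frequencies (no calendar months)."""
    if nominal < initial_instance:
        raise ValueError("nominal time precedes the dataset's initial instance")
    k = (nominal - initial_instance) // frequency  # whole instances elapsed
    return initial_instance + (k + n) * frequency

# A daily dataset whose initial instance is 2017-05-17T10:30Z, resolved
# for an action with nominal time 2017-07-22T10:30Z (as in this issue):
initial = datetime(2017, 5, 17, 10, 30)
nominal = datetime(2017, 7, 22, 10, 30)
print(coord_current(0, nominal, initial, timedelta(days=1)))  # 2017-07-22 10:30:00
```

        With a hardcoded instance, by contrast, every action resolves to the same 2017-05-17 partition regardless of nominal time, which matches the behaviour reported above.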
        joha0123 Johannes Mayer added a comment - edited

        I think I found the issue: in the coordinator.xml the input-event instance is hardcoded:

            <input-events>
                <data-in name="export-input" dataset="export-dataset">
                    <instance>2017-05-17T10:30Z</instance>
                </data-in>
            </input-events>
        

        where it should be something like ${coord:current(0)}.

        Please refer to the Oozie Documentation - CoordinatorSpec, see the example.
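        For illustration, the corrected input-events block would presumably look like the following (an editor's sketch based on the snippet above; the name and dataset attributes are taken unchanged from the coordinator.xml excerpt):

```xml
<input-events>
    <data-in name="export-input" dataset="export-dataset">
        <!-- resolve the instance relative to the action's nominal time
             instead of hardcoding 2017-05-17T10:30Z -->
        <instance>${coord:current(0)}</instance>
    </data-in>
</input-events>
```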

        joha0123 Johannes Mayer added a comment

        I was able to submit the job manually to Oozie, and it now works as expected. I only changed the coordinator.xml as described in my previous comment.
        Do you think I can make this work somehow in my Falcon installation on Hortonworks?

        I am in the finishing touches of my master's thesis and need this as soon as possible. Thank you very much.

        sandeep.samudrala sandeep samudrala added a comment

        Johannes Mayer: Thanks for all the information and debugging. You are right about the issue.
        I went through the code and found that this will need a code fix.


          People

          • Assignee:
            Unassigned
            Reporter:
            joha0123 Johannes Mayer
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue
