Apache Hop / HOP-3984

Not getting complete data in output while running on spark engine


Details

    Description

      When a simple pipeline with a Text File Input and a Text File Output transform runs on Spark, it does not write the complete output to the output file.

      How to reproduce:

      1) Create a simple pipeline with two transforms: Text File Input and Text File Output.

      2) Use any simple CSV/TXT file in the Text File Input transform.

      3) Write the data to a text/CSV file using the Text File Output transform.

      If x lines are read in step 2, only y lines are written in step 3, where x > y.

      Because the pipeline has no intermediate transforms, the output should be unchanged, i.e. x should equal y.
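
      The x versus y check described above can be reproduced outside Hop with a small script. This is only a sketch for verifying the symptom, not part of the attached pipelines: the helper names count_lines and compare_line_counts are hypothetical, and the glob pattern for the output is an assumption that covers the case where a run leaves more than one output file (the report itself mentions a single output file).

```python
import glob


def count_lines(path):
    """Count lines in a text file; a final line without a newline still counts."""
    count = 0
    with open(path, "rb") as handle:
        for _ in handle:
            count += 1
    return count


def compare_line_counts(input_path, output_pattern):
    """Return (x, y, missing): input lines, output lines, and their difference.

    output_pattern is a glob pattern, so lines are summed over every
    matching output file (an assumption; a single path also works).
    """
    x = count_lines(input_path)
    output_files = sorted(glob.glob(output_pattern))
    y = sum(count_lines(path) for path in output_files)
    return x, y, x - y
```

      For example, compare_line_counts("names.txt", "simple_mapping_output_2_*.txt") returns the two counts and the number of missing lines. Header rows are counted as ordinary lines, so a header written only on one side shifts the difference by one.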

      The output still does not match when using zipped input, zipped output, or any other option in the input, output, or execution dialogs.

      Attaching:

      1) Simple pipeline: hop_pipeline_simple.hpl

      2) Pipeline with different scenarios: hop_pipeline_multiple_scenarios.hpl

      3) Input files: names.txt, names.zip and simple_mapping_output.txt_20220608_122959.txt

      4) Output file: simple_mapping_output_2_20220608_172947.txt

      Attachments

        1. image-2022-06-08-18-08-06-655.png
          6 kB
          Utkarsh Singhal
        2. names.zip
          0.3 kB
          Utkarsh Singhal
        3. names.txt
          38 kB
          Utkarsh Singhal
        4. hop_pipeline_multiple_scenarios.hpl
          7 kB
          Utkarsh Singhal
        5. hop_pipeline_simple.hpl
          33 kB
          Utkarsh Singhal
        6. simple_mapping_output_2_20220608_172947.txt
          59 kB
          Utkarsh Singhal
        7. simple_mapping_output.txt_20220608_122959.txt
          97 kB
          Utkarsh Singhal

          People

            mcasters Matt Casters
            utkarshutr Utkarsh Singhal