  1. CarbonData
  2. CARBONDATA-3161

Pipe "|" delimiter is not working for streaming table

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: data-load
    • Labels: None

    Description

      CSV data that uses "|" as the delimiter is not loaded correctly into a streaming table.

      DDL:

      create table table1_st(begintime TIMESTAMP, deviceid STRING, statcycle INT, topologypath STRING, devicetype STRING, rebootnum INT) stored by 'carbondata' TBLPROPERTIES('SORT_SCOPE'='GLOBAL_SORT','sort_columns'='deviceid,begintime','streaming' ='true');

      Run in spark shell:

      import org.apache.spark.sql.SparkSession;
      import org.apache.spark.sql.SparkSession.Builder;
      import org.apache.spark.sql.CarbonSession;
      import org.apache.spark.sql.CarbonSession.CarbonBuilder;
      import org.apache.spark.sql.streaming._
      import org.apache.carbondata.streaming.parser._

      val enableHiveSupport = SparkSession.builder().enableHiveSupport();
      val carbon=new CarbonBuilder(enableHiveSupport).getOrCreateCarbonSession("hdfs://hacluster/user/hive/warehouse/")
      val df=carbon.readStream.text("/user/*.csv")

      val qrymm_0001 = df.writeStream.format("carbondata").option(CarbonStreamParser.CARBON_STREAM_PARSER, CarbonStreamParser.CARBON_STREAM_PARSER_CSV).option("delimiter","|").option("header","false").option("dbName","stdb").option("checkpointLocation", "/tmp/tb1").option("bad_records_action","FORCE").option("tableName","table1_st").trigger(ProcessingTime(6000)).option("carbon.streaming.auto.handoff.enabled","true").option("TIMESTAMPFORMAT","yyyy-dd-MM HH:mm:ss").start
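
      A note for anyone reproducing this: the TIMESTAMPFORMAT value above places the day before the month. The sketch below assumes the option follows java.text.SimpleDateFormat pattern semantics (an assumption, not confirmed in this report) and shows how a value such as 2018-10-01 00:00:00 is read under that pattern compared with yyyy-MM-dd. The object and method names are hypothetical and exist only for illustration.

```scala
import java.text.SimpleDateFormat
import java.util.Calendar

// Hypothetical helper for illustration; not part of CarbonData.
object TimestampFormatCheck {
  // Pattern copied from the TIMESTAMPFORMAT option in the report.
  val reportedPattern = "yyyy-dd-MM HH:mm:ss"
  // Month-before-day pattern, shown only for comparison.
  val monthFirstPattern = "yyyy-MM-dd HH:mm:ss"

  /** Parses `value` with `pattern` and returns (year, month 1-12, day of month). */
  def yearMonthDay(pattern: String, value: String): (Int, Int, Int) = {
    val format = new SimpleDateFormat(pattern)
    val cal = Calendar.getInstance()
    cal.setTime(format.parse(value))
    // Calendar.MONTH is zero-based, so add 1 to get a calendar month number.
    (cal.get(Calendar.YEAR), cal.get(Calendar.MONTH) + 1, cal.get(Calendar.DAY_OF_MONTH))
  }

  def main(args: Array[String]): Unit = {
    val sample = "2018-10-01 00:00:00"
    println(s"$reportedPattern   -> ${yearMonthDay(reportedPattern, sample)}")
    println(s"$monthFirstPattern -> ${yearMonthDay(monthFirstPattern, sample)}")
  }
}
```

      Under the reported pattern the sample value is read as 10 January 2018 rather than 1 October 2018. Whether that is intended is not stated in the report, so it is worth ruling out before attributing unexpected timestamp values to the delimiter handling.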

       

      Sample records:
      begintime| deviceid| statcycle| topologypath| devicetype| rebootnum
      2018-10-01 00:00:00|Device1|0|dsad|STB|9
      2018-10-01 00:05:00|Device1|0|Rsad|STB|4
      2018-10-01 00:10:00|Device1|0|fsf|STB|6
      2018-10-01 00:15:00|Device1|0|fdgf|STB|8
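
      For reference, each sample record should split into six fields, one per column in the DDL. The sketch below is plain Scala with no Spark or CarbonData dependency, and it makes no claim about how CarbonData's stream parser is implemented. It only shows the expected result of a literal split on "|", next to what java.lang.String.split does when "|" is passed unquoted and therefore interpreted as a regular expression. The object and method names are hypothetical.

```scala
import java.util.regex.Pattern

// Hypothetical helper for illustration; not part of CarbonData.
object PipeSplitCheck {
  // Number of columns declared in the table1_st DDL.
  val expectedColumns = 6

  /** Splits `line` on `delimiter` treated as literal text. */
  def splitLiteral(line: String, delimiter: String): Array[String] =
    line.split(Pattern.quote(delimiter), -1)

  /** Splits `line` on `delimiter` treated as a regular expression (String.split's default). */
  def splitRegex(line: String, delimiter: String): Array[String] =
    line.split(delimiter, -1)

  def main(args: Array[String]): Unit = {
    val record = "2018-10-01 00:00:00|Device1|0|dsad|STB|9"

    val literal = splitLiteral(record, "|")
    println(s"literal split: ${literal.length} fields: ${literal.mkString("[", ", ", "]")}")

    // An unquoted "|" is an empty regex alternation, so it matches between
    // every character and the field count no longer matches the schema.
    val regex = splitRegex(record, "|")
    println(s"regex split:   ${regex.length} fields")
  }
}
```

      Comparing the loaded rows against the six-field literal split is a quick way to describe exactly how the streaming load differs from the expected result.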

            People

              Assignee: Unassigned
              Reporter: Pawan Malwal (pawanmalwal)

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h