Phoenix / PHOENIX-1405

Problem referencing lower-case column names with Phoenix / Pig / Spark

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 4.3.0, 3.3.0, 4.2.1, 3.2.1
    • Component/s: None
    • Labels: None

    Description

      Given the following table definition:

      CREATE TABLE "mytable" (
        "id" VARCHAR NOT NULL
        CONSTRAINT pk PRIMARY KEY ("id")
      ) SALT_BUCKETS=16
      

      And the following code setting up a PhoenixPigConfiguration:

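      // Package locations assumed from the phoenix-pig module for these versions
      import org.apache.hadoop.conf.Configuration
      import org.apache.hadoop.io.NullWritable
      import org.apache.phoenix.pig.PhoenixPigConfiguration
      import org.apache.phoenix.pig.PhoenixPigConfiguration.SchemaType
      import org.apache.phoenix.pig.hadoop.{PhoenixInputFormat, PhoenixRecord}

      val phoenixConf = new PhoenixPigConfiguration(new Configuration())

      // Table and column names are double-quoted in SQL because they are lower-case
      phoenixConf.setSelectStatement("SELECT \"id\" FROM \"mytable\"")
      phoenixConf.setSelectColumns("id")
      phoenixConf.setSchemaType(SchemaType.QUERY)
      phoenixConf.configure("127.0.0.1", "\"mytable\"", 100)
      // sc is an existing SparkContext
      val phoenixRDD = sc.newAPIHadoopRDD(phoenixConf.getConfiguration,
        classOf[PhoenixInputFormat],
        classOf[NullWritable],
        classOf[PhoenixRecord])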
      

      The above seems to work, but when I later call phoenixConf.getSelectColumnMetadataList, I get the following error:

        java.sql.SQLException: Unable to resolve these column names:
      id
      Available columns with column families:
      _SALT,id
        at org.apache.phoenix.util.PhoenixRuntime.generateColumnInfo(PhoenixRuntime.java:354)
        at org.apache.phoenix.pig.PhoenixPigConfiguration$PhoenixPigConfigurationUtil.getSelectColumnMetadataList(PhoenixPigConfiguration.java:269)
        at org.apache.phoenix.pig.PhoenixPigConfiguration.getSelectColumnMetadataList(PhoenixPigConfiguration.java:157)
        at com.simplymeasured.spark.PhoenixRDD.toSchemaRDD(PhoenixRDD.scala:52)
        at com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply$mcV$sp(PhoenixRDDTest.scala:35)
        at com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply(PhoenixRDDTest.scala:31)
        at com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply(PhoenixRDDTest.scala:31)
        at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
        at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
        at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
      

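      Looking at PhoenixRuntime, getColumnInfo() applies trim().toUpperCase() to each requested column name. That is not valid for case-sensitive (quoted) columns: the lower-case column id is looked up as ID, which does not match any available column. See https://github.com/apache/phoenix/blob/3.0/phoenix-core/src/main/java/org/apache/phoenix/util/PhoenixRuntime.java#L374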
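
      The mismatch can be reproduced without a cluster. The following is a minimal sketch, not the Phoenix source and not any of the attached patches: the object and method names are hypothetical, and the available column set is copied from the error message above. It contrasts unconditional upper-casing with one possible alternative that preserves the case of double-quoted names.

```scala
object ColumnNameLookup {
  // Columns listed as available in the error message for "mytable".
  val available: Set[String] = Set("_SALT", "id")

  // Mimics the reported behaviour: every requested name is trimmed and upper-cased.
  def normalizeAlwaysUpper(name: String): String =
    name.trim.toUpperCase

  // Hypothetical alternative: strip the quotes and keep the case of
  // double-quoted names; upper-case everything else.
  def normalizeRespectingQuotes(name: String): String = {
    val trimmed = name.trim
    val quoted = trimmed.length >= 2 &&
      trimmed.startsWith("\"") && trimmed.endsWith("\"")
    if (quoted) trimmed.substring(1, trimmed.length - 1)
    else trimmed.toUpperCase
  }

  // A name resolves only if its normalized form exactly matches a column.
  def resolves(normalized: String): Boolean =
    available.contains(normalized)
}
```

      Under these assumptions, normalizeAlwaysUpper("id") yields ID, which is not among the available columns, whereas normalizeRespectingQuotes("\"id\"") yields id, which is.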

      I'm attempting to use this from within Spark, and I would like to rely on getSelectColumnMetadataList to build a Schema RDD.

      Attachments

        1. PHOENIX-1405.patch (1 kB, Robert Roland)
        2. PHOENIX-1405_V2.patch (0.8 kB, Samarth Jain)
        3. PHOENIX-1405_V3.patch (5 kB, James R. Taylor)
        4. PHOENIX-1405_V4.patch (3 kB, James R. Taylor)

          People

            Assignee: Robert Roland (robertroland)
            Reporter: Robert Roland (robertroland)
            Votes: 0
            Watchers: 4
