PHOENIX-1405

Problem referencing lower-case column names with Phoenix / Pig / Spark

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 4.3.0, 3.3.0, 4.2.1, 3.2.1
    • Labels:
      None

      Description

      Given the following table definition:

      CREATE TABLE "mytable" (
        "id" VARCHAR NOT NULL
        CONSTRAINT pk PRIMARY KEY ("id")
      ) SALT_BUCKETS=16
      

      And the following code setting up a PhoenixPigConfiguration:

      val phoenixConf = new PhoenixPigConfiguration(new Configuration())
      
      phoenixConf.setSelectStatement("SELECT \"id\" FROM \"mytable\"")
      phoenixConf.setSelectColumns("id")
      phoenixConf.setSchemaType(SchemaType.QUERY)
      phoenixConf.configure("127.0.0.1", "\"mytable\"", 100)
      val phoenixRDD = sc.newAPIHadoopRDD(phoenixConf.getConfiguration,
        classOf[PhoenixInputFormat],
        classOf[NullWritable],
        classOf[PhoenixRecord])
      

      The above seems to work, but when I later call phoenixConf.getSelectColumnMetadataList, I get the following error:

      java.sql.SQLException: Unable to resolve these column names:
      id
      Available columns with column families:
      _SALT,id
        at org.apache.phoenix.util.PhoenixRuntime.generateColumnInfo(PhoenixRuntime.java:354)
        at org.apache.phoenix.pig.PhoenixPigConfiguration$PhoenixPigConfigurationUtil.getSelectColumnMetadataList(PhoenixPigConfiguration.java:269)
        at org.apache.phoenix.pig.PhoenixPigConfiguration.getSelectColumnMetadataList(PhoenixPigConfiguration.java:157)
        at com.simplymeasured.spark.PhoenixRDD.toSchemaRDD(PhoenixRDD.scala:52)
        at com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply$mcV$sp(PhoenixRDDTest.scala:35)
        at com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply(PhoenixRDDTest.scala:31)
        at com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply(PhoenixRDDTest.scala:31)
        at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
        at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
        at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
      

      Looking at PhoenixRuntime, getColumnInfo() normalizes each column name with trim().toUpperCase(), which doesn't seem valid for case-sensitive (quoted) column names such as "id": https://github.com/apache/phoenix/blob/3.0/phoenix-core/src/main/java/org/apache/phoenix/util/PhoenixRuntime.java#L374
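
To illustrate why unconditional upper-casing fails for this table, here is a minimal sketch in plain Scala. It does not use or reproduce the Phoenix source: `ColumnNameLookup`, `availableColumns`, `normalizeAlwaysUpper`, `normalizeRespectingQuotes` and `resolves` are all hypothetical names made up for this example, and the quote-aware variant is only one possible approach, not necessarily what the attached patches do.

```scala
// Hypothetical stand-ins for illustration only; none of these are Phoenix APIs.
object ColumnNameLookup {
  // Column names as reported in the error above: the quoted "id" keeps its lower case.
  val availableColumns: Set[String] = Set("_SALT", "id")

  // Mirrors the trim().toUpperCase() step: every name is folded to upper case.
  def normalizeAlwaysUpper(name: String): String =
    name.trim.toUpperCase

  // Assumed alternative: fold unquoted names to upper case, but keep the exact
  // case of names wrapped in double quotes (the quotes themselves are stripped).
  def normalizeRespectingQuotes(name: String): String = {
    val trimmed = name.trim
    if (trimmed.length >= 2 && trimmed.startsWith("\"") && trimmed.endsWith("\""))
      trimmed.substring(1, trimmed.length - 1)
    else
      trimmed.toUpperCase
  }

  // True when the normalized name matches one of the table's columns.
  def resolves(normalized: String): Boolean =
    availableColumns.contains(normalized)
}
```

With these definitions, `ColumnNameLookup.normalizeAlwaysUpper("id")` yields `ID`, which is not among the available columns, matching the "Unable to resolve these column names" failure. Passing the name in quoted form, `ColumnNameLookup.normalizeRespectingQuotes("\"id\"")`, yields `id`, which does resolve.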

      I'm attempting to use this from within Spark, and I would like to rely on getSelectColumnMetadataList to build a SchemaRDD.

        Attachments

        1. PHOENIX-1405_V2.patch
          0.8 kB
          Samarth Jain
        2. PHOENIX-1405_V3.patch
          5 kB
          James R. Taylor
        3. PHOENIX-1405_V4.patch
          3 kB
          James R. Taylor
        4. PHOENIX-1405.patch
          1 kB
          Robert Roland

            People

            • Assignee: Robert Roland (robertroland)
            • Reporter: Robert Roland (robertroland)