
Details

    Description

      We are dealing with a task that requires casting from the BYTES type to BIGINT. Specifically, we have a string '00T1p'. Our approach is to convert this string to BYTES and then cast the result to BIGINT with the following SQL query:

      SELECT CAST(CAST('00T1p' AS BYTES) AS BIGINT);

      However, this query fails when executed, most likely because of an error in the conversion between BYTES and BIGINT. We aim to identify and fix this issue so that the query runs correctly. The tasks involved are:

      1. Investigate and identify the specific reason for the failure of conversion from BYTES to BIGINT.
      2. Design and implement a solution to ensure our query can function correctly.
      3. Test this solution across all required scenarios to guarantee its functionality.
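
      As a concrete illustration of the intended semantics, the sketch below (plain Python, not the engine's implementation) interprets the UTF-8 bytes of '00T1p' as an unsigned big-endian integer, which is one plausible meaning of CAST(BYTES AS BIGINT); the choice of big-endian order here is an assumption:

```python
# Plain-Python sketch (not the engine's implementation): interpret the
# UTF-8 bytes of '00T1p' as an unsigned big-endian integer. Big-endian
# order is an assumption, not confirmed engine behavior.
raw = '00T1p'.encode('utf-8')                 # b'00T1p' -> 5 bytes: 30 30 54 31 70
value = int.from_bytes(raw, byteorder='big')  # 0x3030543170
print(raw.hex(), value)                       # 3030543170 206969254256
```

      Whatever value the engine ultimately produces, it should be deterministic and documented, so tests can pin it down.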

       

      See also how other systems handle this conversion:

      1. PostgreSQL: PostgreSQL supports casting from its BYTES type (BYTEA) to numeric types (INTEGER, BIGINT, DECIMAL, etc.). In PostgreSQL, you can use CAST or the type-cast operator (::) to perform the conversion. URL: https://www.postgresql.org/docs/current/sql-expressions.html#SQL-SYNTAX-TYPE-CASTS

      2. MySQL: MySQL supports casting from its BYTES types (BLOB, BINARY) to numeric types (INTEGER, BIGINT, DECIMAL, etc.). In MySQL, you can use the CAST or CONVERT functions to perform the conversion. URL: https://dev.mysql.com/doc/refman/8.0/en/cast-functions.html

      3. Microsoft SQL Server: SQL Server supports casting from its BYTES types (VARBINARY, IMAGE) to numeric types (INT, BIGINT, NUMERIC, etc.). You can use the CAST or CONVERT functions to perform the conversion. URL: https://docs.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql

      4. Oracle Database: Oracle supports casting from the RAW type (its equivalent of BYTES) to numeric types (NUMBER, INTEGER, FLOAT, etc.). You can use the TO_NUMBER function to perform the conversion. URL: https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/TO_NUMBER.html

      5. Apache Spark: A Spark DataFrame supports casting binary types (BinaryType, ByteType) to numeric types (IntegerType, LongType, DecimalType, etc.) via the cast function. URL: https://spark.apache.org/docs/latest/api/sql/#cast
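
      Whichever engine is used, a BIGINT is a fixed-width 64-bit value, so a byte string shorter than 8 bytes needs a padding convention and a longer one must be rejected. Below is a hedged sketch of one such convention; the left-padding and signedness choices are assumptions, not any particular engine's documented behavior:

```python
import struct

def bytes_to_bigint(b: bytes) -> int:
    """Decode a byte string as a signed 64-bit big-endian integer.

    Assumed convention (not any engine's documented behavior): left-pad
    with zero bytes to 8 bytes; reject inputs longer than 8 bytes, since
    they cannot fit in a BIGINT.
    """
    if len(b) > 8:
        raise ValueError("byte string does not fit in a 64-bit BIGINT")
    return struct.unpack('>q', b.rjust(8, b'\x00'))[0]

print(bytes_to_bigint(b'00T1p'))  # 206969254256
```

      Note that under signed decoding, 8 bytes of 0xFF come out as -1, so the signedness choice is visible to users and should be stated explicitly in the fix.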

       

      A byte-order problem may also arise (little- vs. big-endian). Several systems illustrate how this is handled:

       

      1. Apache Hadoop: Hadoop, being an open-source framework, has to deal with byte order issues across different platforms and architectures. The Hadoop File System (HDFS) uses a technique called "sequence files," which include metadata to describe the byte order of the data. This metadata ensures that data is read and written correctly, regardless of the endianness of the platform.

      2. Apache Avro: Avro is a data serialization system used by various big data frameworks like Hadoop and Apache Kafka. Avro uses a compact binary encoding format that includes a marker for the byte order. This allows Avro to handle endianness issues seamlessly when data is exchanged between systems with different byte orders.

      3. Apache Parquet: Parquet is a columnar storage format used in big data processing frameworks like Apache Spark. Parquet uses a little-endian format for encoding numeric values, which is the most common format on modern systems. When reading or writing Parquet data, data processing engines typically handle any necessary byte order conversions transparently.

      4. Apache Spark: Spark is a popular big data processing engine for distributed systems. It relies on the underlying data formats it reads (e.g., Avro, Parquet, ORC) to manage byte order; because these formats encode byte order explicitly, Spark can process the same data consistently across platforms.

      5. Google Cloud BigQuery: BigQuery is a serverless data warehouse offered by Google Cloud. When dealing with binary data and endianness, BigQuery relies on the data encoding format. For example, when loading data in Avro or Parquet formats, these formats already include byte order information, allowing BigQuery to handle data across different platforms correctly.
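
      The practical impact of byte order can be shown in a few lines: the same 8 bytes decode to different integers depending on the endianness assumed (illustrative Python, not tied to any of the systems above):

```python
# The same 8 bytes yield different integers under different byte orders.
data = (1234567890).to_bytes(8, byteorder='little')   # d2 02 96 49 00 00 00 00
as_little = int.from_bytes(data, byteorder='little')  # round-trips to 1234567890
as_big = int.from_bytes(data, byteorder='big')        # a very different value
print(as_little, as_big, as_little == as_big)
```

      Any CAST(BYTES AS BIGINT) implementation therefore has to commit to one byte order and document it, or results will differ across platforms.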

          People

            hanyuzheng Hanyu Zheng
