Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: php
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      I've developed a PHP library for Avro against the Avro 1.3.3 spec as part
      of using Avro at myYearbook.com. It's based on the Python and Ruby
      libraries. Current functionality includes schema parsing, plus Avro data
      file and string IO.

      Known issues:

      • Support for field defaults is incomplete.
      • Lacks RPC and protocol support, JSON encoding, and deflate codec.

      Attachments

      1. AVRO-627.patch.4.zip
        34 kB
        Michael Glaesemann
      2. AVRO-627.patch.3.zip
        31 kB
        Michael Glaesemann
      3. AVRO-627.patch.2.zip
        31 kB
        Michael Glaesemann
      4. AVRO-627.patch.zip
        31 kB
        Michael Glaesemann

        Activity

        Michael Glaesemann added a comment -

        Great! Thanks for your help, Doug!

        Doug Cutting added a comment -

        I committed this. Thanks, Michael!

        I made a few changes to the build.sh dist implementation, namely:

        • fixed broken README copy
        • added NOTICE & LICENSE
        • stopped copying .svn files from examples
        • moved tarball from ../../build to ../../dist
        Michael Glaesemann added a comment -

        Updated to use the GMP PHP extension on 32-bit platforms, raising an exception if it's not available. 64-bit platforms use the native PHP operations even if GMP is available. Tested on Mac OS X 64-bit, Ubuntu 64-bit (without GMP), Ubuntu 32-bit with GMP, and CentOS 5 64-bit.

        Assuming things look good, any chance of this getting rolled into 1.4.0? I know you've already cut the RC, but I thought I'd ask anyway.
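
The platform rule described in this comment can be sketched roughly as follows. This is an illustration rather than code from the patch: the function name `avro_long_backend` and its return values are hypothetical, and `PHP_INT_SIZE` is assumed to be a reliable indicator of the native integer width.

```php
<?php
// Hypothetical helper (not from the patch): choose how to handle Avro's
// 64-bit longs on the current platform.
function avro_long_backend()
{
    // PHP_INT_SIZE is the size of a PHP integer in bytes:
    // 8 on builds with 64-bit integers, 4 on builds with 32-bit integers.
    if (PHP_INT_SIZE >= 8) {
        // Native integers are wide enough; use them even if GMP is loaded.
        return 'native';
    }
    if (extension_loaded('gmp')) {
        // 32-bit integers with GMP: use arbitrary-precision arithmetic.
        return 'gmp';
    }
    // 32-bit integers without GMP: fail loudly rather than corrupt values.
    throw new RuntimeException(
        'Avro long support requires 64-bit integers or the GMP extension');
}
```

Calling `avro_long_backend()` once at load time would surface the unsupported case early instead of during the first long read or write.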

        Michael Glaesemann added a comment -

        That makes sense. I'll take a look.

        Doug Cutting added a comment -

        > the json_decode function in current releases of PHP casts JSON long values to floats on 32-bit platforms.

        The only longs in JSON should be default values, no? The lack of the ability to accept large default values for longs would be a bug, but probably not many would encounter it. The value of longs in data structures could use gmp, no? Or am I missing something?
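
To make the suggestion concrete, here is a minimal sketch of encoding a long with GMP. The Avro specification encodes longs with variable-length zig-zag coding; the function name `encode_long_gmp` is hypothetical, the code is not taken from the patch, and it assumes the GMP extension is loaded.

```php
<?php
// Hypothetical sketch (not from the patch): encode an Avro long, passed as
// a decimal string, using GMP so that it also works with 32-bit integers.
function encode_long_gmp($decimal)
{
    $n = gmp_init($decimal, 10);

    // Zig-zag step: non-negative n maps to 2n, negative n maps to -2n - 1,
    // so small magnitudes of either sign become small unsigned values.
    $z = (gmp_sign($n) >= 0)
        ? gmp_mul($n, 2)
        : gmp_sub(gmp_mul(gmp_neg($n), 2), 1);

    // Variable-length step: emit 7 bits per byte, low-order group first,
    // setting the high bit on every byte except the last.
    $bytes = '';
    while (gmp_cmp($z, 127) > 0) {
        $low = gmp_intval(gmp_and($z, 127));
        $bytes .= chr($low | 0x80);
        $z = gmp_div_q($z, 128);
    }
    return $bytes . chr(gmp_intval($z));
}
```

For example, `bin2hex(encode_long_gmp('64'))` gives `8001` and `bin2hex(encode_long_gmp('-1'))` gives `01`. Passing the value as a string keeps it exact even where a PHP integer could not hold it.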

        Michael Glaesemann added a comment -

        I put the 64-bit check in deliberately. As I understand it, the json_decode function in current releases of PHP casts JSON long values to floats on 32-bit platforms. It looks like a future release will allow representing longs as strings on 32-bit via a JSON_BIGINT_AS_STRING option, but it's not available now, unfortunately. So even if I did my PHP bit-twiddling using the GMP library, I wouldn't be able to get the actual value out of the JSON, unless I parsed the JSON myself.

        I'll build a 32-bit VM and see if I can figure out something.

        Another option would be to start from scratch and build the PHP lib using the C library as an extension, though when I first took a look at that, it didn't look feasible.

        Any other ideas? I'm open to suggestions.
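
For illustration, the difference the `JSON_BIGINT_AS_STRING` option makes can be sketched as below. This assumes a PHP release where the option exists (PHP 5.4 or later). The JSON fragment is made up for the example and deliberately uses 2^63, which is too large even for a 64-bit integer, so the effect shows on both 32-bit and 64-bit builds; on a 32-bit build the same applies to any value above 2147483647.

```php
<?php
// Made-up field definition with an integer too large for a PHP integer.
$json = '{"name": "n", "type": "long", "default": 9223372036854775808}';

// Default behaviour: the oversized integer is decoded as a float, which
// cannot represent every large integer exactly.
$lossy = json_decode($json, true);

// With JSON_BIGINT_AS_STRING the digits are preserved as a string, which
// could then be handed to GMP.
$exact = json_decode($json, true, 512, JSON_BIGINT_AS_STRING);

var_dump(is_float($lossy['default'])); // bool(true)
var_dump($exact['default']);           // string(19) "9223372036854775808"
```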

        Doug Cutting added a comment -

        Now tests fail with "This is not a 64-bit platform". It looks like PHP only supports 32-bit integers on 32-bit hardware like my laptop.

        Might we use gmp for longs? From my reading, a PHP float is a C double, while a PHP int is a C long, which is 32 or 64 bits, depending on the architecture.
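
A quick way to see the integer width on a given machine, and what happens past it, is sketched below. It uses only built-in constants; the variable name is illustrative.

```php
<?php
// Report the width and largest value of this build's integer type.
printf("integer size: %d bytes, max: %d\n", PHP_INT_SIZE, PHP_INT_MAX);

// Going past PHP_INT_MAX does not wrap around: the result silently becomes
// a float, which has only 53 bits of integer precision.
$overflowed = PHP_INT_MAX + 1;

var_dump(is_int(PHP_INT_MAX));   // bool(true)
var_dump(is_float($overflowed)); // bool(true)
```

With 32-bit integers, `PHP_INT_MAX` is 2147483647, so an Avro long (64-bit signed) can exceed it; that is the case where an arbitrary-precision library such as GMP would be needed.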

        Michael Glaesemann added a comment -

        AVRO-627.patch.3.zip adds support for defaults and supersedes all previous patches.

        Michael Glaesemann added a comment -

        I set up an Ubuntu VM and it looks like the failures are due to stricter PHP error checking in the Ubuntu install (E_ALL) relative to the Mac OS X install (E_ALL & ~E_NOTICE). I've tightened up the code and all the tests pass for me on both Mac OS X and Ubuntu. Updated patch attached.
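
The difference between the two settings can be sketched as follows. The missing-key access is only one common example of code that stays quiet under `E_ALL & ~E_NOTICE` but is reported under `E_ALL`; whether the patch contained this exact pattern is an assumption, and the `$schema` array is made up for the example.

```php
<?php
// E_ALL & ~E_NOTICE is E_ALL with the notice bit cleared.
$relaxed = E_ALL & ~E_NOTICE;
var_dump(($relaxed & E_NOTICE) === 0); // bool(true): notices are suppressed
var_dump((E_ALL & E_NOTICE) !== 0);    // bool(true): notices are reported

// Report everything, as in the stricter configuration.
error_reporting(E_ALL);

$schema = array('type' => 'enum', 'name' => 'Test');

// Reading $schema['symbols'] directly would be reported under E_ALL because
// the key is missing. Checking with isset() first avoids that.
$symbols = isset($schema['symbols']) ? $schema['symbols'] : array();
var_dump($symbols === array()); // bool(true)
```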

        Doug Cutting added a comment -

        > Are you running the stock Ubuntu 10.04 PHP?

        Yes. I didn't previously have PHP installed and got whatever 'sudo apt-get install phpunit' installed:

        Setting up php5-common (5.3.2-1ubuntu4.2) ...
        Setting up php5-cli (5.3.2-1ubuntu4.2) ...
        
        Creating config file /etc/php5/cli/php.ini with new version
        update-alternatives: using /usr/bin/php5 to provide /usr/bin/php (php) in auto mode.
        
        Setting up php-pear (5.3.2-1ubuntu4.2) ...
        Setting up php-benchmark (1.2.7-4) ...
        Setting up phpunit (3.4.5-1) ...
        
        Michael Glaesemann added a comment -

        Thanks for trying this out so quickly.

        I'll take a look at these. All the tests pass for me on Mac OS X 10.6.4, PHP 5.3.3RC3 (cli) (built: Aug 19 2010 13:58:08).

        Are you running the stock Ubuntu 10.04 PHP?

        Doug Cutting added a comment -

        This looks great!

        I installed phpunit on Ubuntu 10.04 and was able to run tests.

        We should also update the top-level README.txt, listing PHP requirements.

        Tests however failed with:

        There were 2 failures:
        
        1) SchemaTest::test_parse with data set #38 (SchemaExample)
        schema_string: 
            {"type": "array",
             "items": {"type": "enum", "name": "Test", "symbols": ["A", "B"]}}
            
        Sub-schema for AvroArraySchema not a valid Avro schema. Bad schema: Array
        (
            [type] => enum
            [name] => Test
            [symbols] => Array
                (
                    [0] => A
                    [1] => B
                )
        
        )
        
        Failed asserting that <boolean:true> is false.
        
        /home/cutting/src/avro/trunk/lang/php/test/SchemaTest.php:459
        
        2) SchemaTest::test_parse with data set #40 (SchemaExample)
        schema_string: 
            {"type": "map",
             "values": {"type": "enum", "name": "Test", "symbols": ["A", "B"]}}
            
        Sub-schema for AvroMapSchema not a valid Avro schema. Bad schema: Array
        (
            [type] => enum
            [name] => Test
            [symbols] => Array
                (
                    [0] => A
                    [1] => B
                )
        
        )
        
        Failed asserting that <boolean:true> is false.
        
        /home/cutting/src/avro/trunk/lang/php/test/SchemaTest.php:459
        
        FAILURES!
        Tests: 189, Assertions: 261, Failures: 2, Errors: 89, Incomplete: 3.
        

        Any idea what this could be?

        It'd be great to include this in 1.4.0, which I hope to get out in the next few days.
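
For readers unfamiliar with PHP, the "Bad schema: Array ( ... )" dump in the failure output above has the shape that `print_r` produces for an associative array. A minimal sketch reproducing it from the first failing schema string, using only `json_decode` (the variable names are illustrative and none of this is the library's own parsing code):

```php
<?php
// The schema string from failure 1 in the test output.
$schema_string = '{"type": "array",
 "items": {"type": "enum", "name": "Test", "symbols": ["A", "B"]}}';

// Decode to nested associative arrays (second argument true).
$schema = json_decode($schema_string, true);

// The nested "items" schema is the part reported as "Bad schema".
print_r($schema['items']);
```

The printed structure has the keys `type`, `name` and `symbols`, matching the dump in the test output, which suggests the JSON decoding itself succeeded and the rejection happened later, during validation of the sub-schema.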

        Michael Glaesemann added a comment -

        This includes updating the root build.sh script to

        • run the PHP unit tests
        • generate an interop avro file for PHP
        • test PHP interop, and
        • build a PHP distribution package.

        The distribution and interop file generation have no additional dependencies (other than PHP). The tests require PHPUnit <http://www.phpunit.de/>.


          People

          • Assignee:
            Michael Glaesemann
            Reporter:
            Michael Glaesemann
          • Votes:
            0
            Watchers:
            3