Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Duplicate
    • Affects Version/s: 0.2
    • Fix Version/s: 0.7
    • Component/s: PHP - Library
    • Labels:
      None
    • Environment:

      Fedora 8, 64bit, php 5.2.4

      Description

      TException: TSocket: Could not write 48 bytes localhost:9160

      The error comes out often from TSocket#write($buf), row 279 at line 292:

      while (strlen($buf) > 0) {
      $got = @fwrite($this->handle_, $buf);
      if ($got === 0 || $got === FALSE) {
      $md = stream_get_meta_data($this->handle_);
      if ($md['timed_out'])

      { throw new TException('TSocket: timed out writing '.strlen($buf).' bytes from '. $this->host_.':'.$this->port_); }

      else

      { // HERE throw new TException('TSocket: Could not write '.strlen($buf).' bytes '. $this->host_.':'.$this->port_); }

      }
      $buf = substr($buf, $got);
      }

      I found a similar issue in reading TSocket, described here: https://issues.apache.org/jira/browse/THRIFT-347

        Activity

        Jake Farrell made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Nicholas Telford made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Fix Version/s 0.7 [ 12315360 ]
        Resolution Duplicate [ 3 ]
        Hide
        Nicholas Telford added a comment -

        I believe this has been addressed in #867, and to an extent in #638. Marking as a duplicate since I'm pretty sure those tickets cover this issue appropriately.

        #638 re-wrote all the TSocket read/write code so if you're still experiencing problems post-0.7, can you open a new ticket?

        Show
        Nicholas Telford added a comment - I believe this has been addressed in #867, and to an extent in #638. Marking as a duplicate since I'm pretty sure those tickets cover this issue appropriately. #638 re-wrote all the TSocket read/write code so if you're still experiencing problems post-0.7, can you open a new ticket?
        Nicholas Telford made changes -
        Assignee Nicholas Telford [ nicktelford ]
        Bryan Duxbury made changes -
        Fix Version/s 0.6 [ 12315359 ]
        Bryan Duxbury made changes -
        Fix Version/s 0.6 [ 12315359 ]
        Fix Version/s 0.5 [ 12315235 ]
        Hide
        Loreto Parisi added a comment -

        Right!

        Show
        Loreto Parisi added a comment - Right!
        Hide
        Arya Goudarzi added a comment -

        Loreto,

        I worked on https://issues.apache.org/jira/browse/THRIFT-867. 887 is duplicate of 867.

        Show
        Arya Goudarzi added a comment - Loreto, I worked on https://issues.apache.org/jira/browse/THRIFT-867 . 887 is duplicate of 867.
        Hide
        Loreto Parisi added a comment -

        Just added https://issues.apache.org/jira/browse/THRIFT-887

        Thanks to Arya for deep investigations.

        Show
        Loreto Parisi added a comment - Just added https://issues.apache.org/jira/browse/THRIFT-887 Thanks to Arya for deep investigations.
        Hide
        Arya Goudarzi added a comment -

        To me it seams that the patch in THRIFT-837 breaks TFramedTransport. For example the recv_ methods end up getting this exception:

        PHP Fatal error: Uncaught exception 'TException' with message 'TSocket: Could not read 4 bytes from 10.50.26.132:9160' in /mnt/hgfs/NFS/code/vendors/thrift/transport/TSocket.php:231
        Stack trace:
        #0 /mnt/hgfs/NFS/code/vendors/thrift/transport/TFramedTransport.php(135): TSocket->readAll(4)
        #1 /mnt/hgfs/NFS/code/vendors/thrift/transport/TFramedTransport.php(102): TFramedTransport->readFrame()
        #2 [internal function]: TFramedTransport->read(8192)
        #3 /mnt/hgfs/NFS/code/vendors/thrift/packages/cassandra/Cassandra.php(752): thrift_protocol_read_binary(Object(TBinaryProtocolAccelerated), 'cassandra_Cassa...', false)
        #4 /mnt/hgfs/NFS/code/vendors/thrift/packages/cassandra/Cassandra.php(727): CassandraClient->recv_batch_mutate()
        #5 /mnt/hgfs/NFS/code/private/classes/cassandra/base.php(334): CassandraClient->batch_mutate(Array, 1)
        #6 /mnt/hgfs/NFS/code/private/classes/cassandra/columnfamily.php(54): Cassandra->batchMutate('agoudarzi_Keysp...', Array, NULL)
        #7 /mnt/hgfs/NFS/code/dev/unit_tests/classes/cassandra/6_size_>8k_THRIFT-837.t(17) in /mnt/hgfs/NFS/code/vendors/thrift/transport/TSocket.php on line 231

        Not using the accelerated version works just fine. I agree that prior to this patch users will get TSocket timeout errors for inserts larger than 8k (thrift_protocol's default buffer size) but applying THRIFT-837 patch breaks communication in TFramedTransport.

        I have investigated this and I got the following suspicious points:

        1. thrift_protocol does a flush before direct write. Seams OK to flush and them direct write the rest of the data but I am not sure how the Thrift interface on the server side will treat the incomplete flushed buffer in FramedTransport. I say this because I compared the stream_metadata of TProtocol with TProtocolAccelerated and it clearly looks there is an extra flush of data before the large data gets written and flushed.

        2. Secondly, I wrote the data once using TProtocol and confirmed on the server side successfully, switched to Accelerated mode and tried to only read. I got exception coming out of thrift_protocol saying TType -114 is not value. I know this is an invalid type but I don't know how it got it.

        I think THRIFT-837 should be re-opened until fix is workable across different versions of Transport. I will comment more if I find something interesting.

        Show
        Arya Goudarzi added a comment - To me it seams that the patch in THRIFT-837 breaks TFramedTransport. For example the recv_ methods end up getting this exception: PHP Fatal error: Uncaught exception 'TException' with message 'TSocket: Could not read 4 bytes from 10.50.26.132:9160' in /mnt/hgfs/NFS/code/vendors/thrift/transport/TSocket.php:231 Stack trace: #0 /mnt/hgfs/NFS/code/vendors/thrift/transport/TFramedTransport.php(135): TSocket->readAll(4) #1 /mnt/hgfs/NFS/code/vendors/thrift/transport/TFramedTransport.php(102): TFramedTransport->readFrame() #2 [internal function] : TFramedTransport->read(8192) #3 /mnt/hgfs/NFS/code/vendors/thrift/packages/cassandra/Cassandra.php(752): thrift_protocol_read_binary(Object(TBinaryProtocolAccelerated), 'cassandra_Cassa...', false) #4 /mnt/hgfs/NFS/code/vendors/thrift/packages/cassandra/Cassandra.php(727): CassandraClient->recv_batch_mutate() #5 /mnt/hgfs/NFS/code/private/classes/cassandra/base.php(334): CassandraClient->batch_mutate(Array, 1) #6 /mnt/hgfs/NFS/code/private/classes/cassandra/columnfamily.php(54): Cassandra->batchMutate('agoudarzi_Keysp...', Array, NULL) #7 /mnt/hgfs/NFS/code/dev/unit_tests/classes/cassandra/6_size_>8k_ THRIFT-837 .t(17) in /mnt/hgfs/NFS/code/vendors/thrift/transport/TSocket.php on line 231 Not using the accelerated version works just fine. I agree that prior to this patch users will get TSocket timeout errors for inserts larger than 8k (thrift_protocol's default buffer size) but applying THRIFT-837 patch breaks communication in TFramedTransport. I have investigated this and I got the following suspicious points: 1. thrift_protocol does a flush before direct write. Seams OK to flush and them direct write the rest of the data but I am not sure how the Thrift interface on the server side will treat the incomplete flushed buffer in FramedTransport. I say this because I compared the stream_metadata of TProtocol with TProtocolAccelerated and it clearly looks there is an extra flush of data before the large data gets written and flushed. 2. Secondly, I wrote the data once using TProtocol and confirmed on the server side successfully, switched to Accelerated mode and tried to only read. I got exception coming out of thrift_protocol saying TType -114 is not value. I know this is an invalid type but I don't know how it got it. I think THRIFT-837 should be re-opened until fix is workable across different versions of Transport. I will comment more if I find something interesting.
        Hide
        Bryan Duxbury added a comment -

        I've looked over php_thrift_protocol.cpp, and there are only two places that call directWrite, both of which seem to me to be doing the correct thing. (flush() calls directWrite with the buffer, and write() calls directWrite with the passed-in data after flushing if the amount to be written exceeds the size of the buffer.)

        Maybe I'm missing something, and I'm not the original author, but I don't see what's going wrong here. Maybe some stack traces would be nice?

        Show
        Bryan Duxbury added a comment - I've looked over php_thrift_protocol.cpp, and there are only two places that call directWrite, both of which seem to me to be doing the correct thing. (flush() calls directWrite with the buffer, and write() calls directWrite with the passed-in data after flushing if the amount to be written exceeds the size of the buffer.) Maybe I'm missing something, and I'm not the original author, but I don't see what's going wrong here. Maybe some stack traces would be nice?
        Hide
        Arya Goudarzi added a comment -

        The patch in THRIFT-837 broke all my reads too. Not sure why, investigating...

        Show
        Arya Goudarzi added a comment - The patch in THRIFT-837 broke all my reads too. Not sure why, investigating...
        Hide
        Loreto Parisi added a comment -

        I verified that this always happens when writing data larger than 8Kb.
        Also, it comes from the directWrite, as I was able to find out from the stack trace.
        I'm looking for the rigth line where the trace begins.

        Update asap.

        Show
        Loreto Parisi added a comment - I verified that this always happens when writing data larger than 8Kb. Also, it comes from the directWrite, as I was able to find out from the stack trace. I'm looking for the rigth line where the trace begins. Update asap.
        Hide
        Bryan Duxbury added a comment -

        Let me see if I understand this: You're saying that the change I applied to directWrite (THRIFT-837) causes you to start seeing write timeouts? If that's the case, then yes, that's probably because directWrite is being called incorrectly elsewhere. Anyone want to jump in and find out where?

        Show
        Bryan Duxbury added a comment - Let me see if I understand this: You're saying that the change I applied to directWrite ( THRIFT-837 ) causes you to start seeing write timeouts? If that's the case, then yes, that's probably because directWrite is being called incorrectly elsewhere. Anyone want to jump in and find out where?
        Hide
        safari added a comment - - edited

        +1. This patch worked for me. The problem happened when you write data larger than 8192 bytes.

        Show
        safari added a comment - - edited +1. This patch worked for me. The problem happened when you write data larger than 8192 bytes.
        Hide
        Pieter Maes added a comment -

        I have an simmular issue
        what i have found is (when using the php thift module)
        since the fix of directWrite in /ext/thrift_protocol/php_thrift_protocol.cpp
        ----------
        void directWrite(const char* data, size_t len) {
        zval writefn;
        ZVAL_STRING(&writefn, "write", 0);
        --char* newbuf = (char*)emalloc(buffer_used + 1);
        --memcpy(newbuf, buffer, buffer_used);
        --newbuf[buffer_used] = '\0';
        ++char* newbuf = (char*)emalloc(len + 1);
        ++memcpy(newbuf, data, len);
        ++newbuf[len] = '\0';
        zval *args[1];
        MAKE_STD_ZVAL(args[0]);
        --ZVAL_STRINGL(args[0], newbuf, buffer_used, 0);
        ++ZVAL_STRINGL(args[0], newbuf, len, 0);
        TSRMLS_FETCH();
        zval ret;
        ----------
        since this was in place, the write fails
        i guess it is called by with the wrong paramaters somewhere?
        so if you undo this (set len to buffer_used and data to buffer)
        it works

        Show
        Pieter Maes added a comment - I have an simmular issue what i have found is (when using the php thift module) since the fix of directWrite in /ext/thrift_protocol/php_thrift_protocol.cpp ---------- void directWrite(const char* data, size_t len) { zval writefn; ZVAL_STRING(&writefn, "write", 0); --char* newbuf = (char*)emalloc(buffer_used + 1); --memcpy(newbuf, buffer, buffer_used); --newbuf [buffer_used] = '\0'; ++char* newbuf = (char*)emalloc(len + 1); ++memcpy(newbuf, data, len); ++newbuf [len] = '\0'; zval *args [1] ; MAKE_STD_ZVAL(args [0] ); --ZVAL_STRINGL(args [0] , newbuf, buffer_used, 0); ++ZVAL_STRINGL(args [0] , newbuf, len, 0); TSRMLS_FETCH(); zval ret; ---------- since this was in place, the write fails i guess it is called by with the wrong paramaters somewhere? so if you undo this (set len to buffer_used and data to buffer) it works
        Bryan Duxbury made changes -
        Fix Version/s 0.5 [ 12315235 ]
        Fix Version/s 0.4 [ 12314960 ]
        Bryan Duxbury made changes -
        Field Original Value New Value
        Fix Version/s 0.4 [ 12314960 ]
        Fix Version/s 0.2 [ 12313769 ]
        Loreto Parisi created issue -

          People

          • Assignee:
            Nicholas Telford
            Reporter:
            Loreto Parisi
          • Votes:
            2 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development