Solr / SOLR-1549

SqlEntityProcessor does not recognize onError attribute

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 1.4, 1.5
    • Fix Version/s: None
    • Labels:
      None

      Description

      Unfortunately, the SqlEntityProcessor does not honor the value of an entity's onError attribute in DIH's data config file. As a result, when an SQL exception is thrown inside the constructor of ResultSetIterator (an inner class of JdbcDataSource), Solr's import exits immediately, even when onError is set to continue or skip.

      In my opinion, there are use cases that would benefit from database-related exception handling inside Solr (e.g., when the existence of certain database tables or views is not predictable).
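To make the requested behavior concrete, here is a minimal sketch of an import loop that consults an onError setting when an entity's query fails. None of these types are DIH classes: `OnErrorSketch`, `EntityQuery`, and `importAll` are hypothetical names used only for illustration, and the sketch treats skip and continue alike, which may be coarser than what DIH does.

```java
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are DIH classes.
class OnErrorSketch {
    enum OnError { ABORT, SKIP, CONTINUE }

    // Stand-in for running an entity's query; it may fail if a table or view is missing.
    interface EntityQuery {
        List<String> rows() throws SQLException;
    }

    // Collects rows from each entity, consulting onError when a query fails.
    static List<String> importAll(List<EntityQuery> entities, OnError onError)
            throws SQLException {
        List<String> imported = new ArrayList<>();
        for (EntityQuery entity : entities) {
            try {
                imported.addAll(entity.rows());
            } catch (SQLException e) {
                if (onError == OnError.ABORT) {
                    throw e; // stop the whole import
                }
                // SKIP and CONTINUE are treated alike in this sketch: drop the
                // failing entity's rows and move on to the next entity.
            }
        }
        return imported;
    }

    public static void main(String[] args) throws SQLException {
        EntityQuery ok = () -> List.of("row-1", "row-2");
        EntityQuery missingView = () -> {
            throw new SQLException("Table or view does not exist");
        };
        // With SKIP, the failing entity is ignored and the others are imported.
        System.out.println(importAll(List.of(ok, missingView, ok), OnError.SKIP));
    }
}
```

With `OnError.ABORT` the same call rethrows the SQLException, which matches the behavior the report describes for every setting.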

      1. SOLR-1549.patch
        2 kB
        Alexander Kanarsky

        Activity

        Alexander Kanarsky added a comment (edited)

        James, just to clarify the rationale behind the patch: in my case it was a MySQL problem with 'zero' timestamps, where a TIMESTAMP field with a zero value caused this exception:

        SQLException: Cannot convert value '0000-00-00 00:00:00' from column 5 to TIMESTAMP.

        Clearly a data issue with valid SQL, so I wrote the patch to skip the few documents with such timestamps rather than fail the whole full import. However, I learned later that the MySQL connector has a 'zeroDateTimeBehavior' connection option that can be set to 'convertToNull' rather than the default 'exception', which solved the problem as well.

        But I agree that there are very few cases like this where you might want to continue after an SQLException.
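As an illustration of the workaround mentioned above, the following sketch appends the `zeroDateTimeBehavior=convertToNull` option to a MySQL JDBC URL. The `ZeroDateUrl` class and its method are hypothetical helpers, and the host, port, and database name are placeholders. The value spelling is the one quoted in the comment; other driver versions may spell it differently, so check the documentation of the driver in use.

```java
// Hypothetical helper; host, port, and database name below are placeholders.
class ZeroDateUrl {
    // Appends zeroDateTimeBehavior=convertToNull to a MySQL JDBC URL,
    // using '&' if the URL already carries connection options and '?' otherwise.
    static String withConvertToNull(String jdbcUrl) {
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + "zeroDateTimeBehavior=convertToNull";
    }

    public static void main(String[] args) {
        System.out.println(withConvertToNull("jdbc:mysql://localhost:3306/mydb"));
    }
}
```

In a DIH setup, a URL of this form would go into the `url` attribute of the JDBC data source definition in the data config file.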

        James Dyer made changes -
        Status: Open [ 1 ] -> Resolved [ 5 ]
        Resolution: Won't Fix [ 2 ]
        James Dyer added a comment -

        Closing for now, as this issue has not generated interest. It doesn't seem wise to try to continue if there is an SQLException.

        James Dyer added a comment -

        The patch looks simple and straightforward, but I question the wisdom of allowing anything other than "abort" in the case of an SQLException. You can't simply continue and read the next row of the query when this happens. My thinking is that "onError" is more for handling whole text documents that for some reason don't parse correctly.

        I guess if it were a child entity doing n+1 selects, it could just run the next select for the following document. Perhaps if the parent entity sometimes passes it a wrongly-typed join value, this would happen, and it would be OK to go on, assuming the next doc has a correct join value? But then again, you could just craft your SQL to handle this contingency in these cases.

        So if Sascha, Alexander, or anyone else can better explain why this is a good idea, and possibly contribute a unit test, then perhaps we can commit this. Otherwise I think it's a "won't fix".
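The n+1 child-entity scenario discussed in this comment can be sketched as follows. This is not DIH code: `ChildEntitySketch`, `ChildSelect`, and `buildDocs` are hypothetical names, and the handling of skip (drop the document) and continue (keep the document without child fields) is only this sketch's interpretation of those settings.

```java
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an n+1 import loop; not DIH code.
class ChildEntitySketch {
    enum OnError { ABORT, SKIP, CONTINUE }

    // Stand-in for the child entity's select, run once per parent row.
    interface ChildSelect {
        String fetch(String joinValue) throws SQLException;
    }

    // Builds one document per parent key, applying onError when the child select fails.
    static List<String> buildDocs(List<String> parentKeys, ChildSelect child, OnError onError)
            throws SQLException {
        List<String> docs = new ArrayList<>();
        for (String key : parentKeys) {
            try {
                docs.add(key + "=" + child.fetch(key));
            } catch (SQLException e) {
                switch (onError) {
                    case ABORT:
                        throw e;          // fail the whole import
                    case SKIP:
                        break;            // drop this document, go on to the next parent
                    case CONTINUE:
                        docs.add(key + "="); // keep the document without child fields
                        break;
                }
            }
        }
        return docs;
    }

    public static void main(String[] args) throws SQLException {
        // The child select rejects join values that are not numeric.
        ChildSelect child = key -> {
            if (!key.matches("\\d+")) {
                throw new SQLException("wrongly typed join value: " + key);
            }
            return "child-of-" + key;
        };
        System.out.println(buildDocs(List.of("1", "x", "2"), child, OnError.SKIP));
    }
}
```

Because each parent row triggers its own select, a failure for one join value does not affect the select for the next parent, which is what makes going on plausible in this case.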

        Hoss Man made changes -
        Assignee: Noble Paul [ noble.paul ] -> James Dyer [ jdyer ]
        Fix Version/s 4.0 [ 12322551 ]
        Hoss Man added a comment -
        • There is no indication that anyone is actively working on this issue, so removing 4.0 from the fixVersion
        • assigning to James to assess the existing patch for reconsideration
        Robert Muir made changes -
        Fix Version/s 4.0 [ 12322551 ]
        Fix Version/s 4.0-BETA [ 12322455 ]
        Robert Muir added a comment -

        rmuir20120906-bulk-40-change

        Hoss Man made changes -
        Fix Version/s 4.0 [ 12322455 ]
        Fix Version/s 4.0-ALPHA [ 12314992 ]
        Hoss Man added a comment -

        Bulk fixing the version info for 4.0-ALPHA and 4.0; all affected issues have "hoss20120711-bulk-40-change" in a comment.

        Robert Muir made changes -
        Fix Version/s 3.6 [ 12319065 ]
        Simon Willnauer made changes -
        Fix Version/s 3.6 [ 12319065 ]
        Fix Version/s 3.5 [ 12317876 ]
        Robert Muir made changes -
        Fix Version/s 3.5 [ 12317876 ]
        Fix Version/s 3.4 [ 12316683 ]
        Robert Muir added a comment -

        3.4 -> 3.5

        Robert Muir made changes -
        Fix Version/s 3.4 [ 12316683 ]
        Fix Version/s 4.0 [ 12314992 ]
        Fix Version/s 3.3 [ 12316471 ]
        Robert Muir made changes -
        Fix Version/s 3.3 [ 12316471 ]
        Fix Version/s 3.2 [ 12316172 ]
        Robert Muir added a comment -

        Bulk move 3.2 -> 3.3

        Hoss Man made changes -
        Fix Version/s 3.2 [ 12316172 ]
        Fix Version/s Next [ 12315093 ]
        Hoss Man made changes -
        Fix Version/s Next [ 12315093 ]
        Fix Version/s 1.5 [ 12313566 ]
        Hoss Man added a comment -

        Bulk updating 240 Solr issues to set the Fix Version to "next" per the process outlined in this email...

        http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3Calpine.DEB.1.10.1005251052040.24672@radix.cryptio.net%3E

        Selection criteria were "Unresolved" with a Fix Version of 1.5, 1.6, 3.1, or 4.0. Email notifications were suppressed.

        A unique token for finding these 240 issues in the future: hossversioncleanup20100527

        Alexander Kanarsky made changes -
        Attachment SOLR-1549.patch [ 12444026 ]
        Alexander Kanarsky added a comment -

        Attached is a patch for JdbcDataSource that uses the onError value properly and handles the SQLException thrown when the actual data does not match the column type. The patch is for the DIH version included in the Solr 1.4 distribution.

        Noble Paul made changes -
        Fix Version/s 1.5 [ 12313566 ]
        Noble Paul made changes -
        Field Original Value New Value
        Assignee Noble Paul [ noble.paul ]
        Sascha Szott created issue -

          People

          • Assignee:
            James Dyer
            Reporter:
            Sascha Szott
          • Votes:
            0
            Watchers:
            2
