Details

      Description

      I'm thinking to work on a patch that adds SQL92JDBCDataModel based on the sample code of PostgreSQLJDBCDataModel. This class will only use standard SQL92. It won't be high performance, but would have maximum compatibility to other popular DBMS, such as Oracle, SQLlite, etc. This is quite important to Drupal/Mahout integration.

      Before I actually write the code, is there any suggestions or concerns? Thanks!

      1. SQL92.patch
        19 kB
        Daniel Xiaodan Zhou
      2. SQL92_postgres.patch
        35 kB
        Daniel Xiaodan Zhou

        Activity

        Hide
        Sean Owen added a comment -

        Seems fine to me. Just follow the pattern of the other implementations.

        Show
        Sean Owen added a comment - Seems fine to me. Just follow the pattern of the other implementations.
        Hide
        Daniel Xiaodan Zhou added a comment -

        patch attached. copied most of the code from PostgreSQLJDBCDataModel

        Show
        Daniel Xiaodan Zhou added a comment - patch attached. copied most of the code from PostgreSQLJDBCDataModel
        Hide
        Sean Owen added a comment -

        Yes it's almost completely identical to the Postgres version – this doesn't seem right. Either we can just relax the Postgres implementation to be SQL92 compliant, if it doesn't hurt it much, or this needs to be refactored to encapsulate their commonality. I am not comfortable with that much cloning.

        Show
        Sean Owen added a comment - Yes it's almost completely identical to the Postgres version – this doesn't seem right. Either we can just relax the Postgres implementation to be SQL92 compliant, if it doesn't hurt it much, or this needs to be refactored to encapsulate their commonality. I am not comfortable with that much cloning.
        Hide
        Daniel Xiaodan Zhou added a comment -

        I think so too. The only different between SQL92 implementation and PostgreSQL is in setPreference(): SQL92 uses SELECT to test whether a record exists, and use INSERT or UPDATE accordingly. PostgreSQL simply INSERT first, in case of DUPLICATE KEY catches the exception (with a special exception ID) and UPDATE. Arguably this is a little faster than SQL92.

        My view is to have SQL92 as base class, and PostgreSQL as subclass that only overrides setPreference(). In fact, the PostgreSQL implementation is also quite identical to MySQL implementation too. Further refactoring is to pull up the set of SQL statements as a member method getDefaultSQL() into AbstractJDBCDataModel. This removes duplicate code in MySQL/Postgre/SQL92 too.

        If you think this is fine, I can do the refactoring. Do you think it's OK to check in this patch first for archival purpose?

        Also, I'm not sure whether it's OK to use "@author" in Javadoc. Not so many classes in Taste have that...

        Show
        Daniel Xiaodan Zhou added a comment - I think so too. The only different between SQL92 implementation and PostgreSQL is in setPreference(): SQL92 uses SELECT to test whether a record exists, and use INSERT or UPDATE accordingly. PostgreSQL simply INSERT first, in case of DUPLICATE KEY catches the exception (with a special exception ID) and UPDATE. Arguably this is a little faster than SQL92. My view is to have SQL92 as base class, and PostgreSQL as subclass that only overrides setPreference(). In fact, the PostgreSQL implementation is also quite identical to MySQL implementation too. Further refactoring is to pull up the set of SQL statements as a member method getDefaultSQL() into AbstractJDBCDataModel. This removes duplicate code in MySQL/Postgre/SQL92 too. If you think this is fine, I can do the refactoring. Do you think it's OK to check in this patch first for archival purpose? Also, I'm not sure whether it's OK to use "@author" in Javadoc. Not so many classes in Taste have that...
        Hide
        Sean Owen added a comment -

        I think Apache code style guidelines discourage @author tags on individual files. This policy encourages everyone to view all the code as "theirs".

        The patch is fine, as it is, and is attached here in JIRA "for the record". I don't think it needs to be committed. I assure you I also will make sure this gets in, in the right form.

        And yes if you have time to adjust this patch as we've discussed, that would be great. I think you're welcome to keep it simple: extend the PostgreSQL implementation and override. Don't bother with MySQL. Even though it kind of makes some sense that the "SQL92" implementation is the base version, in practice a SQL92-compliant database that isn't one of the two supported RDBMSes is the rare beast.

        That said, if you really wanted to do a big proper clean refactoring to put the SQL92 implementation as the parent, I wouldn't object if it were nice and tidy. I wouldn't tell you that you need to do this though.

        Show
        Sean Owen added a comment - I think Apache code style guidelines discourage @author tags on individual files. This policy encourages everyone to view all the code as "theirs". The patch is fine, as it is, and is attached here in JIRA "for the record". I don't think it needs to be committed. I assure you I also will make sure this gets in, in the right form. And yes if you have time to adjust this patch as we've discussed, that would be great. I think you're welcome to keep it simple: extend the PostgreSQL implementation and override. Don't bother with MySQL. Even though it kind of makes some sense that the "SQL92" implementation is the base version, in practice a SQL92-compliant database that isn't one of the two supported RDBMSes is the rare beast. That said, if you really wanted to do a big proper clean refactoring to put the SQL92 implementation as the parent, I wouldn't object if it were nice and tidy. I wouldn't tell you that you need to do this though.
        Hide
        Daniel Xiaodan Zhou added a comment -

        I will remove @author tags from SQL92 implementation to follow Mahout standard. For the record, I think the Apache coding guideline (http://portals.apache.org/development/code-standards.html) suggests using the @author tags.

        I agree not to bother with MySQL. Between Postgres and SQL92, I'd rather have SQL92 as base. Because SQL92 is not likely to change in the future, but Postgres might. Especially if you look at PostgreSQLJDBCDataModel.java line 62:

        private static final String POSTGRESQL_DUPLICATE_KEY_STATE = "23505"; // this is brittle...

        This is probably going to change in the future, and might affect SQL92 if SQL92 derives from Postgres...

        On the other hand, the benefit of duplicated code is that it allows some independence. Besides, the only duplication between SQL92 and Postgres is with the constructors (and the fact that they both override setPreference(), though differently). It has some but not much benefit to have Postgres extends SQL92 or vice versa. So I think the current implementation is fine too.

        The reason I'd like to push SQL92 is that Drupal 7 supports Oracle/MSSQL/SQLite (http://drupal.org/requirements). So I need to make sure Mahout can access those databases too for Drupal/Mahout integration.

        What do you think?

        Show
        Daniel Xiaodan Zhou added a comment - I will remove @author tags from SQL92 implementation to follow Mahout standard. For the record, I think the Apache coding guideline ( http://portals.apache.org/development/code-standards.html ) suggests using the @author tags. I agree not to bother with MySQL. Between Postgres and SQL92, I'd rather have SQL92 as base. Because SQL92 is not likely to change in the future, but Postgres might. Especially if you look at PostgreSQLJDBCDataModel.java line 62: private static final String POSTGRESQL_DUPLICATE_KEY_STATE = "23505"; // this is brittle... This is probably going to change in the future, and might affect SQL92 if SQL92 derives from Postgres... On the other hand, the benefit of duplicated code is that it allows some independence. Besides, the only duplication between SQL92 and Postgres is with the constructors (and the fact that they both override setPreference(), though differently). It has some but not much benefit to have Postgres extends SQL92 or vice versa. So I think the current implementation is fine too. The reason I'd like to push SQL92 is that Drupal 7 supports Oracle/MSSQL/SQLite ( http://drupal.org/requirements ). So I need to make sure Mahout can access those databases too for Drupal/Mahout integration. What do you think?
        Hide
        Sean Owen added a comment -

        (Those are just the conventions of one project, Portals, and I don't know that they're standard. Look at what you turn up with "apache @author tag" on Google for instance.)

        Sounds great to have PostgreSQL extend SQL92, please go for it as you see fit.

        Show
        Sean Owen added a comment - (Those are just the conventions of one project, Portals, and I don't know that they're standard. Look at what you turn up with "apache @author tag" on Google for instance.) Sounds great to have PostgreSQL extend SQL92, please go for it as you see fit.
        Hide
        Daniel Xiaodan Zhou added a comment -

        new patch

        Show
        Daniel Xiaodan Zhou added a comment - new patch
        Hide
        Daniel Xiaodan Zhou added a comment -

        new patch attached. thanks for the explanation about @author tags.
        the patch is successfully built in maven, and I have successfully tested it too.

        Show
        Daniel Xiaodan Zhou added a comment - new patch attached. thanks for the explanation about @author tags. the patch is successfully built in maven, and I have successfully tested it too.
        Hide
        Sean Owen added a comment -

        OK, submitted with some minor changes

        Show
        Sean Owen added a comment - OK, submitted with some minor changes
        Hide
        Daniel Xiaodan Zhou added a comment -

        thanks!

        Show
        Daniel Xiaodan Zhou added a comment - thanks!

          People

          • Assignee:
            Sean Owen
            Reporter:
            Daniel Xiaodan Zhou
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development