OFBiz / OFBIZ-4535

Search using Russian word (maybe others) causes distortion and failed product search

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not a Problem
    • Affects Version/s: Release Branch 12.04, Trunk
    • Fix Version/s: None
    • Labels:
      None
    • Environment:

      Ubuntu and others.

      Description

      NOTE: The following post contains UTF8 characters.

      After rebuilding keywords in a UTF-8 database (PostgreSQL) and searching from a UTF-8 browser (Chrome), the Russian word is correctly present in the database, as follows:

      ofbiz=# select * from product_keyword where keyword = 'игроков';
      product_id | keyword | relevancy_weight (trimmed)
      ---------------------------------------
      DVDMV-ADVANGYM | игроков | 1

      However, when pasting 'игроков' into the ecommerce search, "Not Found" is returned and the keyword is echoed back as follows:

      Keywords: "Ð¸Ð³Ñ Ð¾ÐºÐ¾Ð²", where any word matches, which distorts the search.

      This may affect other languages, which I haven't tested.

      I also tried the same search in 12.04 (demo) and trunk and it produces the same issue. This means that multi-language product search is broken in OFBiz. 11.04 is unaffected.
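
      The distorted keyword has the typical shape of UTF-8 bytes that were decoded with a single-byte Latin charset. The following is a minimal sketch of that effect in plain Java, not OFBiz code; the class and method names are made up for illustration, and the choice of ISO-8859-1 as the wrong charset is an assumption. The keyword is written with Unicode escapes so the example does not depend on the source file encoding.

      ```java
      import java.nio.charset.StandardCharsets;

      class MojibakeDemo {
          // Decode the UTF-8 bytes of s as if they were ISO-8859-1 (the assumed wrong charset).
          static String misdecode(String s) {
              return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
          }

          // Reverse the damage: recover the original bytes, then decode them as UTF-8.
          static String repair(String garbled) {
              return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
          }

          public static void main(String[] args) {
              // The Russian keyword from the report, as Unicode escapes.
              String keyword = "\u0438\u0433\u0440\u043e\u043a\u043e\u0432";
              String garbled = misdecode(keyword);
              System.out.println(garbled.length());                 // 14: two chars per Cyrillic letter
              System.out.println(repair(garbled).equals(keyword));  // true
          }
      }
      ```

      Each Cyrillic letter occupies two bytes in UTF-8, so the seven-letter keyword becomes fourteen Latin-1 characters. The third letter encodes to the bytes D1 80, and byte 80 maps to a control character in ISO-8859-1, which would explain the apparent gap in the echoed keyword.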

          Activity

          mz4wheeler created issue -
          mz4wheeler added a comment -

          Happens in other languages as well.

          mz4wheeler added a comment -

          OK: I discovered that this used to work with the original 9.04. During the recent updates, this UTF8 search broke.

          mz4wheeler added a comment -

          I loaded the stock 11.04 branch, and this problem does NOT occur.

          Jacques Le Roux added a comment -

          Do you reproduce this on trunk and stable demos?

          mz4wheeler added a comment -

          Hey Jacques. Try posting 'игроков' into the ecommerce search of the current trunk. It returns 'Ð¸Ð³Ñ Ð¾ÐºÐ¾Ð²' not found. If you post it to the 9.04 search, or branch 11.04, it returns 'игроков' not found.

          Uwe Allner added a comment -

          Yes, it occurs on trunk and the demos as well. And not only in the search, but also e.g. when creating an address with German umlaut characters in the ecommerce application. It seems to me there is a general problem with encoding/decoding, even if the parameters are transported in the body of a POST request.
          I checked all known means of fixing this: the SetCharacterEncodingFilter in my application's web.xml is set to UTF-8, the web page is UTF-8 encoded, the Tomcat connector is set to UTF-8, the request is marked as UTF-8, and the response also comes back as UTF-8.
          Perhaps a double encoding is done somewhere, but I wasn't able to find it, nor to change the behaviour of OFBiz in this regard in any way... :o(

          Jacques Le Roux added a comment - - edited

          ==== EDIT ====
          Hi Uwe,

          Thanks for your help. From the description it seems only Release 09.04.01 and trunk are affected. This could help to track down the issue: since it works on 09.04, the culprit commit must lie in between (11.04 also works, or worked, which could further help the tracking). For now it's not a priority for me, sorry...

          mz4wheeler made changes -
          Field             | Original Value                 | New Value
          Affects Version/s |                                | Release Branch 12.04 [ 12321265 ]
          Affects Version/s | Release 09.04.01 [ 12316422 ]  |
          mz4wheeler added a comment -

          This shows the commit diff in detail.

          mz4wheeler made changes -
          Attachment r1127449-r1127394.diff [ 12583241 ]
          mz4wheeler added a comment - - edited

          OK: I managed to track this down. The problem first showed up in r1127449, committed by Hans:

          r1127449 | hansbak | 2011-05-25 02:25:16 -0700 (Wed, 25 May 2011) | 1
          Changed paths:
          M /ofbiz/trunk/applications/order/webapp/ordermgr/entry/cart/minicart.ftl
          M /ofbiz/trunk/applications/order/webapp/ordermgr/entry/catalog/breadcrumbs.ftl
          M /ofbiz/trunk/applications/order/webapp/ordermgr/entry/catalog/categorydetail.ftl
          M /ofbiz/trunk/applications/order/webapp/ordermgr/entry/catalog/compareproducts.ftl
          M /ofbiz/trunk/applications/order/webapp/ordermgr/entry/catalog/productsummary.ftl
          M /ofbiz/trunk/applications/product/src/org/ofbiz/product/category/CatalogUrlFilter.java
          M /ofbiz/trunk/specialpurpose/ecommerce/data/DemoConfigurator.xml
          M /ofbiz/trunk/specialpurpose/ecommerce/data/DemoFinAccount.xml
          M /ofbiz/trunk/specialpurpose/ecommerce/data/DemoPopularCategoriesData.xml
          M /ofbiz/trunk/specialpurpose/ecommerce/data/DemoProduct.xml
          M /ofbiz/trunk/specialpurpose/ecommerce/data/DemoPurchasing.xml
          M /ofbiz/trunk/specialpurpose/ecommerce/webapp/ecomclone/WEB-INF/web.xml
          M /ofbiz/trunk/specialpurpose/ecommerce/webapp/ecommerce/WEB-INF/web.xml
          M /ofbiz/trunk/specialpurpose/ecommerce/webapp/ecommerce/cart/showcart.ftl
          M /ofbiz/trunk/specialpurpose/ecommerce/webapp/ecommerce/catalog/ShowBestSellingCategory.ftl
          M /ofbiz/trunk/specialpurpose/ecommerce/webapp/ecommerce/catalog/minilastviewedcategories.ftl
          M /ofbiz/trunk/specialpurpose/ecommerce/webapp/ecommerce/catalog/miniproductsummary.ftl
          M /ofbiz/trunk/specialpurpose/ecommerce/webapp/ecommerce/catalog/productdetail.ftl
          M /ofbiz/trunk/specialpurpose/ecommerce/webapp/ecommerce/catalog/sidedeepcategory.ftl

          this change will introduce seo friendly urls for products and categories which are generated from the description. It also allows to convert url's from previous systems to point to the product in the current system. In the ecommerce demo records have been added to show this feature.
          ------------------------------------------------------------------------

          svn switch -r1127449 http://svn.apache.org/repos/asf/ofbiz/branches/release12.04 (produces ERROR)
          svn switch -r1127394 http://svn.apache.org/repos/asf/ofbiz/branches/release12.04 (GOOD)

          All versions of 11.04 are good.

          I looked over the diff (attached file) and it isn't obvious to me how this affects searching with a UTF-8 word. The same issue is present in both the standard and advanced ecommerce search, in 12.04 and trunk, but the back-end search is not affected. Maybe Hans can look it over.

          mz4wheeler made changes -
          Description: the final paragraph was changed from "I also tried the same search in 9.04 (demo) and trunk and it produces the same issue. This means that multi-language product search is broken in OFBiz." to "I also tried the same search in 12.04 (demo) and trunk and it produces the same issue. This means that multi-language product search is broken in OFBiz. 11.04 is unaffected." The rest of the description is unchanged.

          Hans Bakker added a comment -

          When we can find some time, we will have a look at it...

          mz4wheeler added a comment -

          If you take these out of:

          specialpurpose/ecommerce/webapp/ecommerce/WEB-INF/web.xml
          specialpurpose/ecommerce/webapp/ecomclone/WEB-INF/web.xml

          + <filter>
          + <filter-name>CatalogUrlFilter</filter-name>
          + <display-name>CatalogUrlFilter</display-name>
          + <filter-class>org.ofbiz.product.category.CatalogUrlFilter</filter-class>
          + <init-param><param-name>defaultLocaleString</param-name><param-value>en_US</param-value></init-param>
          + <init-param><param-name>redirectUrl</param-name><param-value>/control/main</param-value></init-param>
          + </filter>
          ...
          + <filter-mapping>
          + <filter-name>CatalogUrlFilter</filter-name>
          + <url-pattern>/*</url-pattern>
          + </filter-mapping>

          the search returns to normal.

          Saroj Khamlue made changes -
          Comment [ I tried taking these out:

          <filter>
              <filter-name>CatalogUrlFilter</filter-name>
              <display-name>CatalogUrlFilter</display-name>
              <filter-class>org.ofbiz.product.category.CatalogUrlFilter</filter-class>
              <init-param><param-name>defaultLocaleString</param-name><param-value>en_US</param-value></init-param>
              <init-param><param-name>redirectUrl</param-name><param-value>/control/main</param-value></init-param>
          </filter>

          ...

          <filter-mapping>
              <filter-name>CatalogUrlFilter</filter-name>
              <url-pattern>/*</url-pattern>
          </filter-mapping>

          The multi-language search then works fine.

          Please help me confirm that we can take these out without side effects.
          ]
          mz4wheeler added a comment -

          Hey Saroj. Thanks for looking at this. I don't think it's as simple as just taking out those lines. It does make the UTF8 search work, but it probably breaks the SEO-friendly catalog URLs (just a guess), which was the original purpose of r1127449.

          Jacques Le Roux made changes -
          Link This issue is related to OFBIZ-5312 [ OFBIZ-5312 ]
          Jacques Le Roux added a comment -

          For SEO-Friendly URLs, I think we should revert what we have currently (introduced with r1127449 and sequel/related) and rather use OFBIZ-5312

          Uwe Allner added a comment -

          Dear Sir or Madam,

          I am on parental leave from 09.08.2013 to 08.10.2013. In urgent cases please contact Mr. Markus May (m.may@mvb-online.de).

          Sebastian Wachinger added a comment - - edited

          As pointed out by mz4wheeler, just taking out the CatalogUrlFilter does not solve the problem, especially when you are already using it in the shop.

          After I did not succeed in manipulating web.xml accordingly (i.e. performing some url-pattern and allowedPaths wizardry), I inserted those four lines into CatalogUrlFilter.java which solved this problem for me, along with the bug described in OFBIZ-2837:

          CatalogUrlFilter.java
          --- applications/product/src/org/ofbiz/product/category/CatalogUrlFilter.java	(revision 1529578)
          +++ applications/product/src/org/ofbiz/product/category/CatalogUrlFilter.java	(working copy)
          @@ -80,7 +80,12 @@
                   if (UtilValidate.isNotEmpty(pathInfo)) {
                       List<String> pathElements = StringUtil.split(pathInfo, "/");
                       String alternativeUrl = pathElements.get(0);
                       
          +            if (alternativeUrl.startsWith("control")) {
          +                chain.doFilter(request, response); // Just continue chain.
          +                return;
          +            }
          +            
                       String productId = null;
                       String productCategoryId = null;
                       String urlContentId = null;
          

          While this works for now, it does not deal with the underlying cause, so I would be interested in hearing the experts' opinion here!
          After all, having customers' names and addresses encoded correctly is a must when targeting a global audience.

          On a related issue, I would like to know if there are any drawbacks in the solution Paul proposed in Proposal-URL-Generation-Changes (as quoted in OFBIZ-5312), compared to the current implementation of CatalogUrlFilter.
          Though I'm glad that CatalogUrlFilter exists and happily use it, Paul points at some important issues there.
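
          The guard added by the patch above can be looked at in isolation. The sketch below is plain Java rather than OFBiz code: the class and method names are hypothetical, and it assumes the path is split on "/" with empty segments skipped. It contrasts the prefix test used in the patch with a stricter exact match, since `startsWith("control")` would also bypass the filter for any alternative URL whose first segment merely begins with "control".

          ```java
          import java.util.ArrayList;
          import java.util.List;
          import java.util.StringTokenizer;

          class ControlPathGuard {
              // Split on "/" and skip empty segments (an assumption about how the path is tokenized).
              static List<String> segments(String pathInfo) {
                  List<String> out = new ArrayList<>();
                  StringTokenizer st = new StringTokenizer(pathInfo, "/");
                  while (st.hasMoreTokens()) {
                      out.add(st.nextToken());
                  }
                  return out;
              }

              // Same test as the patch: the first segment starts with "control".
              static boolean bypassByPrefix(String pathInfo) {
                  List<String> s = segments(pathInfo);
                  return !s.isEmpty() && s.get(0).startsWith("control");
              }

              // Stricter variant: the first segment is exactly "control".
              static boolean bypassExact(String pathInfo) {
                  List<String> s = segments(pathInfo);
                  return !s.isEmpty() && s.get(0).equals("control");
              }

              public static void main(String[] args) {
                  // Both paths are made-up examples.
                  System.out.println(bypassByPrefix("/control/keywordsearch"));   // true
                  System.out.println(bypassByPrefix("/controllers-GZ-1000"));     // true
                  System.out.println(bypassExact("/controllers-GZ-1000"));        // false
              }
          }
          ```

          Whether the stricter match is preferable depends on which request paths the shop actually serves; the prefix test is what the patch uses.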

          Jacques Le Roux made changes -
          Link This issue is related to OFBIZ-2837 [ OFBIZ-2837 ]
          Paul Piper added a comment -

          I am not entirely convinced that this is related to the filter, though I could be wrong. The underlying issue looks like an encoding issue to me, switching from UTF-8 to an ISO-based format or vice versa. I am not sure that this is actually happening within the CatalogUrlFilter, but I will have to look into it myself... As a first guess, I would look into the keywordsearch.groovy file, where a character conversion is more likely to be found.

          @sebastian: the drawbacks are mainly related to some hefty changes in a future OFBiz version. Even the patch we provided isn't 100% spot on, since we tried to introduce changes without reworking all of the original URL generation code. Though it can be used, ideally it should be fully integrated into OFBiz at some point.

          You see, the underlying issue with OFBiz URL generation is that there is an assumption of pages being rendered through other means than /control. Though this is never the case, it remains the reason why we misuse the index.jsp file to do redirects to /control and handle it with a controlservlet in there. To be fair, this is common practice in web application development, but a practice that leads to undesired effects from a URL or even customer point of view. The ootb version of OFBiz unfortunately only works around the issue, which is why a lot of redirects take place and, yes, the same content is also made available under two different URLs. Both are dangerous when it comes to SEO. The fix would be to map the controlservlet to / and introduce proper filters to handle the auto-generated URLs (product, catalog).

          Sebastian Wachinger added a comment -

          The presence of CatalogUrlFilter somehow manages to scramble the input of all the forms in the ecommerce app, i.e. customer name and addresses, EFT account details, search etc.; since I partially bypassed CatalogUrlFilter with the quick and dirty hack in my last comment, all those forms do work again for the whole UTF-8 encoded Unicode character set. So if keywordsearch.groovy is the culprit, it might not be the only one.

          @paul: Thanks a lot for the input on the patch in OFBIZ-5312, it's highly appreciated! I noticed that over there things start to happen now, so this definitely is an idea whose time has come!
          Once we have this in OFBiz, the issue of this here thread and others related to it will be gone.
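
          One mechanism that would be consistent with these symptoms, although nothing in this thread confirms it, is the servlet rule that the request character encoding has to be set before the first parameter is read. The toy model below is plain Java and not the servlet API: `LazyDecodingRequest` and its methods are hypothetical, and ISO-8859-1 as the fallback charset is an assumption based on the usual container default.

          ```java
          import java.nio.charset.Charset;
          import java.nio.charset.StandardCharsets;

          class LazyDecodingRequest {
              private final byte[] rawValue;                           // bytes as sent by the browser
              private Charset encoding = StandardCharsets.ISO_8859_1;  // assumed default when none is set
              private String cached;                                   // decoded once, on first access

              LazyDecodingRequest(String submitted) {
                  this.rawValue = submitted.getBytes(StandardCharsets.UTF_8);
              }

              // Only takes effect while no parameter has been decoded yet.
              void setCharacterEncoding(Charset charset) {
                  if (cached == null) {
                      encoding = charset;
                  }
              }

              String getParameter() {
                  if (cached == null) {
                      cached = new String(rawValue, encoding);
                  }
                  return cached;
              }

              public static void main(String[] args) {
                  String keyword = "\u0438\u0433\u0440\u043e\u043a\u043e\u0432";

                  LazyDecodingRequest inOrder = new LazyDecodingRequest(keyword);
                  inOrder.setCharacterEncoding(StandardCharsets.UTF_8);
                  System.out.println(inOrder.getParameter().equals(keyword));   // true

                  LazyDecodingRequest tooLate = new LazyDecodingRequest(keyword);
                  tooLate.getParameter();                                // something reads a parameter first
                  tooLate.setCharacterEncoding(StandardCharsets.UTF_8);  // has no effect any more
                  System.out.println(tooLate.getParameter().equals(keyword));   // false
              }
          }
          ```

          If a filter mapped to /* touched request parameters before the encoding was set further down the chain, every form in the webapp would be affected, which would match the observation that bypassing the filter for /control requests restores correct input. This remains a hypothesis to verify against the actual filter code.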

          Jacques Le Roux made changes -
          Link This issue depends upon OFBIZ-5312 [ OFBIZ-5312 ]
          Jacques Le Roux added a comment -

          Thanks for the comment, Sebastian.

          I made the issues currently "related" to OFBIZ-5312 also "depended upon by" it, so that we can check and hopefully close them once OFBIZ-5312 is committed. Note that there is also OFBIZ-5030.

          Hi Paul,
          Could you elaborate on

          Even the patch we provided isn't 100% spot on, since we tried to introduce changes without reworking all of the original URL generation code.

          Does it mean that the code to do it exists but was not used, or that this code is still missing from the patch? (I have not had the time to review it all yet.)

          Jacques Le Roux added a comment -

          Hi Sebastian,

          Could you try to make searches with special characters using the new branch https://svn.apache.org/repos/asf/ofbiz/branches/OFBIZ-5312-ofbiz-ecommerce-seo-2013-10-23 ?

          Sebastian Wachinger added a comment - - edited

          Hi Jacques,

          I tested the new seo-branch, and as expected there are no issues with the processing of special characters in the search or in the other input fields for customer data, while ALTERNATIVE URL still seems to work.

          @paul: Maybe I should place this question on OFBIZ-5312, but then again it's possible that you already answered it in your comment from 15 October (the not-100%-spot-on-not-yet-fully-integrated bit) in this here thread, so there we go:
          While testing the seo-branch I noticed that now many different variants for the URL of a product page do work,
          e.g. for ecommerce/product/Tiny-Gizmo-gz-1000.html

          • ecommerce/product/Tiny-Gizmo-GZ-1000.html
          • ecommerce/product/tiny-gizmo-gz-1000.html (and all conceivable combinations of upper/lower case)
          • ecommerce/SomeArbitraryCharacters/Tiny-Gizmo-gz-1000.html
          • ecommerce/product/tiny-gismo-gz-1000.html (and upper case variants, the gismo may work because of the intended coexistence with ALTERNATIVE_URL you mentioned on OFBIZ-5312)

          So upper and lower case do not matter, and the product segment can be kept, dropped, or replaced at will. Now I'm wondering if this is a bug or a feature, and whether it has any relevance SEO-wise (could this count as duplicate entries)?

          Jacques Le Roux added a comment -

          Hi Sebastian,

          Indeed this question has already been asked by Parimal in OFBIZ-5312, and I answered at https://issues.apache.org/jira/browse/OFBIZ-5312?focusedCommentId=13808947. More in OFBIZ-5312 itself, after your question there.

          Jacques Le Roux made changes -
          Link This issue is related to OFBIZ-5312 [ OFBIZ-5312 ]
          Paul Piper added a comment - - edited

          Hi Sebastian,

          That is indeed less than ideal. It probably derives from the fact that only the productId/categoryId is evaluated in the URL structure, and not the rest of the string. It probably isn't "too dangerous" if the generated URL is kept the same, but it is risky and not ideal, to say the least. If we do not want to implement a URL database on an entity level, I would propose to build a simple filter that:

          1. Looks up the request URL
          2. Identifies the productId/categoryId/catalogId/contentId in the string
          3. Uses the generator to generate a new request URL
          4. Compares the generated string with the request URL and issues a redirect if they don't match...
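
          The four steps above can be sketched in plain Java. Everything here is hypothetical: the class name, the in-memory map standing in for the URL generator, the product ID GZ-1000, and the assumption that the ID is the last part of the file name. It is meant only to show the compare-and-redirect idea, not how the SEO branch implements it.

          ```java
          import java.util.LinkedHashMap;
          import java.util.Locale;
          import java.util.Map;
          import java.util.Optional;

          class CanonicalUrlCheck {
              // Stand-in for the URL generator: product ID -> canonical path (made-up data).
              private static final Map<String, String> CANONICAL = new LinkedHashMap<>();
              static {
                  CANONICAL.put("GZ-1000", "/product/Tiny-Gizmo-gz-1000.html");
              }

              // Step 2: find a known product ID at the end of the path, ignoring case.
              static Optional<String> findProductId(String path) {
                  String lower = path.toLowerCase(Locale.ROOT);
                  for (String id : CANONICAL.keySet()) {
                      if (lower.endsWith("-" + id.toLowerCase(Locale.ROOT) + ".html")) {
                          return Optional.of(id);
                      }
                  }
                  return Optional.empty();
              }

              // Steps 3 and 4: return the canonical path to redirect to, or empty when the
              // request is already canonical or contains no known ID.
              static Optional<String> redirectTarget(String requestPath) {
                  return findProductId(requestPath)
                          .map(CANONICAL::get)
                          .filter(canonical -> !canonical.equals(requestPath));
              }

              public static void main(String[] args) {
                  System.out.println(redirectTarget("/product/Tiny-Gizmo-gz-1000.html"));
                  // Optional.empty
                  System.out.println(redirectTarget("/SomeArbitraryCharacters/tiny-gismo-GZ-1000.html"));
                  // Optional[/product/Tiny-Gizmo-gz-1000.html]
              }
          }
          ```

          With such a check, the URL variants listed earlier would all be redirected to a single address instead of serving the same page under several URLs.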

          Jacques Le Roux added a comment -

          We can close this issue now that I have committed a change in the SEO branch to differentiate upper and lower case. I still have to handle non-ASCII characters in product names (like çà, which does not exist in French). This is reported in OFBIZ-5312 anyway.

          Jacques Le Roux made changes -
          Status Open [ 1 ] Closed [ 6 ]
          Assignee Jacques Le Roux [ jacques.le.roux ]
          Resolution Not A Problem [ 8 ]

            People

            • Assignee:
              Jacques Le Roux
              Reporter:
              mz4wheeler
            • Votes:
              0
              Watchers:
              6
