SOLR-4464

DIH - Processed documents counter resets to zero after first database request

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.1
    • Fix Version/s: 5.1, 6.0
    • Labels:
    • Environment:

      CentOS 6.3 x64 / apache-tomcat-7.0.35 / mysql-connector-java-5.1.23. Large machine with 5 TB of drives and 280 GB RAM; Java heap set to 250 GB. Resources are not an issue.

      Description

      [11:20] <quasimotoca> Solr 4.1 - Processed documents resets to 0 after processing my first entity - all database schemas are identical
      [11:21] <quasimotoca> However, all the documents get fetched and I can query the results no problem.

      Here's a link to a screenshot - http://findocs/gridworkz.com/solr

      Everything works perfectly, except that the screen doesn't increment the "Processed" counter on subsequent database requests.

      1. 20130921solrzerocounter.png
        181 kB
        Aaron Greenspan
      2. 20130921solrzerocounter2.png
        183 kB
        Aaron Greenspan

          Activity

          Greg Bowyer added a comment -

          There is a good chance that a 250 GB heap is the root cause of your problems. Can you lower it to 16 or 32 GB as a start and then see if the problem persists?

          Greg Bowyer added a comment -

          ... I should read bug reports more carefully. Everything else is working fine, so maybe the heap size is not the issue (I would still lower it, however).

          Dave Cook added a comment -

          Hi Greg:

          I reset to a 32 GB heap. As you can see, "Processed" looks fine on the first
          request; it'll take about 45 minutes to hit the second request.

          http://76.72.160.178:8080/solr/#/zev/dataimport//dataimport

          Cheers,
          Dave

          Dave Cook added a comment -

          Hi Greg:

          It just reset back to zero on the second request:
          http://76.72.160.178:8080/solr/#/zev/dataimport//dataimport

          Cheers,
          Dave

          Shawn Heisey added a comment -

          I'm with the user in #solr. The "Total Documents Processed" field in the raw DIH output appears to go missing when it switches to the second request. It's not there at all.

          Dave Cook added a comment -

          Hi Shawn:

          Yes, that's correct. It's causing the counter up top to zero out...

          Cheers,
          Dave

          Dave Cook added a comment -

          Hi Shawn:

          We're still going; however, the physical memory is maxed out. Is that normal?

          http://76.72.160.178:8080/solr/#/

          Cheers,
          Dave

          Shawn Heisey added a comment -

          This is most likely due to basic operating system design. It's normal for all modern operating systems to utilize all physical memory. The memory that is not used for programs gets used by the OS to cache data on the disk for performance reasons. If a program or the OS requests additional memory, the OS will happily and instantly give up the lowest priority cache data to satisfy the memory request.

          Your Solr admin page seems to be locked up while trying to load the dashboard, so I can't see the actual numbers. I hope everything is OK.

          Shawn Heisey added a comment -

          I haven't checked to see if this is still a problem, but since no action has been taken, it probably is still a problem. Is there any reasonable way to fix it?

          This comment is part of an effort to close old issues that I have reported. Search tag: elyograg2013springclean

          Dave Cook added a comment -

          Hi Shawn:

          Nothing yet. Not a rush but I'm sure other folks will run into it.

          Cheers,
          Dave

          Aaron Greenspan added a comment - edited

          I am running CentOS 6.4 (64-bit) with Solr 4.4.0, MySQL 5.5 and MySQL Connector for Java 5.1.25. I imported one core (full, not delta) over a few hours with no problem. Then I moved on to a second core I wanted to import, which appears to be working except that the "Processed" number is staying at zero. There are no Solr errors and MySQL seems to have no issue with the queries. A screenshot of the counter at zero after more than two hours of error-free processing is attached, though I can't seem to find where it went on JIRA (now I see, it's under "Attachments" above the comments).

          This may be a fairly common problem. See http://stackoverflow.com/questions/8616545/total-documents-processed-0-though-total-rows-fetched-is-non-zero-using-solr-w for a possibly related set of issues.

          Aaron Greenspan added a comment -

          I attached another screenshot of 4.4.0 showing that in fact 3 million documents were processed in the end, even though it said 0 throughout the entire time.

          Thomas Champagne added a comment -

          In Solr 4.9, this problem is caused by line 230 of the DocBuilder class:

          DocBuilder.java, line 230:
          statusMessages.remove(DataImporter.MSG.TOTAL_DOC_PROCESSED);

          After each entity is processed, the status message holding the processed-document count is removed. I don't understand why.

          Shalin Shekhar Mangar added a comment -

          Thanks Thomas for pointing out the offending line. So, this happens when the data-config.xml has more than one entity. After the first entity is fully processed, the total doc processed is removed from the response and not added again until the import is complete. I'll fix by removing the offending line.
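
For illustration, the behaviour described in this comment can be reproduced with a small stand-alone Java sketch. It does not use Solr's classes. The class name, the method `pollDuringEachEntity`, and the key constant are hypothetical, and the sketch assumes that the counter is registered in the status map once when the import starts, which is a simplification rather than a copy of DocBuilder's logic.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

class DihCounterSketch {
    // Hypothetical stand-in for DataImporter.MSG.TOTAL_DOC_PROCESSED.
    static final String TOTAL_DOC_PROCESSED = "Total Documents Processed";

    /**
     * Simulates an import over several entities. For each entity it records
     * what a status poll would read for the processed counter once that
     * entity's documents have been counted. A null entry means the key was
     * missing from the status map at that point.
     */
    static List<Long> pollDuringEachEntity(int[] docsPerEntity, boolean removeAfterEntity) {
        Map<String, Object> statusMessages = new LinkedHashMap<>();
        AtomicLong docCount = new AtomicLong();
        // Assumption: the live counter is registered once, at import start.
        statusMessages.put(TOTAL_DOC_PROCESSED, docCount);

        List<Long> polled = new ArrayList<>();
        for (int docs : docsPerEntity) {
            for (int i = 0; i < docs; i++) {
                docCount.incrementAndGet();
            }
            // A status request arriving while this entity is still active.
            AtomicLong seen = (AtomicLong) statusMessages.get(TOTAL_DOC_PROCESSED);
            polled.add(seen == null ? null : seen.get());
            if (removeAfterEntity) {
                // Mirrors the removal quoted from DocBuilder in this issue.
                statusMessages.remove(TOTAL_DOC_PROCESSED);
            }
        }
        return polled;
    }

    public static void main(String[] args) {
        int[] docsPerEntity = {3, 2};
        System.out.println("with removal:    " + pollDuringEachEntity(docsPerEntity, true));
        System.out.println("without removal: " + pollDuringEachEntity(docsPerEntity, false));
    }
}
```

With the removal in place, the second entity's poll finds no entry for the counter even though documents are still being counted internally, which matches the missing "Total Documents Processed" field reported above. Without the removal, the poll keeps returning the running total.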

          Shalin Shekhar Mangar added a comment -

          Thanks everyone. The fix will be released in 5.1

          ASF subversion and git services added a comment -

          Commit 1665110 from shalin@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1665110 ]

          SOLR-4464: DIH Processed documents counter resets to zero after first entity is processed

          ASF subversion and git services added a comment -

          Commit 1665111 from shalin@apache.org in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1665111 ]

          SOLR-4464: DIH Processed documents counter resets to zero after first entity is processed

          Timothy Potter added a comment -

          Bulk close after 5.1 release


            People

            • Assignee: Shalin Shekhar Mangar
            • Reporter: Dave Cook
            • Votes: 1
            • Watchers: 8
