Nutch / NUTCH-52

Parser plugin for MS Excel files

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8
    • Component/s: fetcher
    • Labels: None

      Description

      Nutch plugin to parse MS Excel files using Jakarta POI, based on the MSPowerPointParser plugin by Stephan Strittmatter.

      1. parse-msexcel.zip
        928 kB
        Rohit Kulkarni
      2. MSExcelParser.java
        4 kB
        Renat Lumpau

        Activity

        Transition           Time In Source Status   Execution Times   Last Executor    Last Execution Date
        Open -> Resolved     290d 11h 3m             1                 Jerome Charron   11/Feb/06 02:09
        Resolved -> Closed   255d 14h 4m             1                 Sami Siren       24/Oct/06 17:14
        Sami Siren made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Sami Siren added a comment -

        closing issues for released versions

        Jerome Charron made changes -
        Fix Version/s 0.8-dev [ 12310224 ]
        Resolution Fixed [ 1 ]
        Status Open [ 1 ] Resolved [ 5 ]
        Jerome Charron added a comment - http://svn.apache.org/viewcvs.cgi?rev=376768&view=rev
        Renat Lumpau made changes -
        Attachment MSExcelParser.java [ 12312133 ]
        Renat Lumpau added a comment -

        I had to hack MSExcelParser.java to get this working with nutch-0.7. I've attached the modified file.

        Rohit Kulkarni made changes -
        Field Original Value New Value
        Attachment parse-msexcel.zip [ 19774 ]
        Rohit Kulkarni added a comment -

        The plugin has been tested against the latest Nutch SVN and seems to work
        fine. Currently only STRING and NUMERIC Excel cell data types are handled.
        Please try it out and let me know if you have any suggestions.

        The plugin is attached as a zip file.

        thanks,

        Rohit
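
        The cell-type restriction mentioned in the comment above can be illustrated with a minimal sketch. It is not taken from the attached plugin and does not use Jakarta POI: the `ExcelTextSketch` class, the `Cell` class, and the `CellKind` enum are hypothetical stand-ins, assumed here only to show the idea of keeping text from STRING and NUMERIC cells and skipping everything else.

        ```java
        import java.util.ArrayList;
        import java.util.Arrays;
        import java.util.List;

        // Hypothetical stand-in types for illustration; NOT Jakarta POI or Nutch classes.
        class ExcelTextSketch {

            // Cell kinds a spreadsheet row might hold in this sketch (assumed names).
            enum CellKind { STRING, NUMERIC, FORMULA, BOOLEAN, BLANK }

            // A minimal immutable cell: a kind plus its raw value.
            static final class Cell {
                final CellKind kind;
                final Object value;

                Cell(CellKind kind, Object value) {
                    this.kind = kind;
                    this.value = value;
                }
            }

            // Collects text from STRING and NUMERIC cells only, joined by single spaces.
            static String extractText(List<Cell> cells) {
                List<String> parts = new ArrayList<>();
                for (Cell cell : cells) {
                    switch (cell.kind) {
                        case STRING:
                            parts.add((String) cell.value);
                            break;
                        case NUMERIC:
                            parts.add(String.valueOf(((Number) cell.value).doubleValue()));
                            break;
                        default:
                            // Every other cell kind is skipped.
                            break;
                    }
                }
                return String.join(" ", parts);
            }

            public static void main(String[] args) {
                List<Cell> row = Arrays.asList(
                    new Cell(CellKind.STRING, "Revenue"),
                    new Cell(CellKind.NUMERIC, 1250.5),
                    new Cell(CellKind.FORMULA, "SUM(A1:A3)"),
                    new Cell(CellKind.BLANK, null));
                System.out.println(extractText(row)); // prints: Revenue 1250.5
            }
        }
        ```

        In this sketch the formula and blank cells contribute nothing to the extracted text, which is the practical consequence of handling only two cell types: content stored in other kinds of cells would not be indexed.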

        Rohit Kulkarni created issue -

          People

          • Assignee: Unassigned
          • Reporter: Rohit Kulkarni
          • Votes: 3
          • Watchers: 0
