Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: core 1.4.3
    • Fix Version/s: None
    • Labels:
      None
    • Environment:
      OS: Windows 2003 SP2; MyEclipse 6.0 / Tomcat 5.5; Athlon 500+

      Description

      Hi,
      I have a .doc file that contains data inside a table, and I want to parse the table to get its cell values. Plain string parsing (with a StringTokenizer) does not work for the table, because the extracted text contains unwanted special characters. I tried converting the .doc file to a .txt file first, since that would make the values easy to split, but I could not get that to work. Can anyone tell me how to parse the values of an MS Word table?

      We also need to know how to index a .doc file so that these special characters are excluded. When an excerpt is shown, the special characters make it unreadable.

      Thanks in advance.
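
      For illustration only, here is a minimal sketch of the clean-up step described above, using nothing but the Java standard library. It assumes the "special characters" are control characters left in the extracted text, and in particular that table cells are terminated by U+0007, as is commonly seen in text pulled from binary Word tables. The report does not confirm either point. The class and method names (ExcerptCleaner, clean, cells) are hypothetical and are not part of Jackrabbit or Tika.

```java
import java.util.ArrayList;
import java.util.List;

public class ExcerptCleaner {

    /**
     * Replaces every ISO control character with a space, then collapses
     * runs of whitespace, so the result is readable in a search excerpt.
     */
    public static String clean(String extracted) {
        StringBuilder sb = new StringBuilder(extracted.length());
        for (int i = 0; i < extracted.length(); i++) {
            char c = extracted.charAt(i);
            sb.append(Character.isISOControl(c) ? ' ' : c);
        }
        return sb.toString().replaceAll("\\s+", " ").trim();
    }

    /**
     * Splits extracted table text into cell values.
     * Assumption: each cell ends with the BEL character (U+0007).
     * Empty cells are dropped, so column positions are not preserved.
     */
    public static List<String> cells(String extracted) {
        List<String> out = new ArrayList<>();
        for (String part : extracted.split("\u0007")) {
            String cell = clean(part);
            if (!cell.isEmpty()) {
                out.add(cell);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical extracted text for a 2x2 table.
        String raw = "Name\u0007Qty\u0007\u0007Bolt\u000712\u0007\u0007";
        System.out.println(clean(raw));
        System.out.println(cells(raw));
    }
}
```

      Replacing control characters with a space, rather than deleting them, keeps adjacent cell values from running together in the indexed text.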

        Activity

        Rajesh Upadhyay created issue -
        Jukka Zitting made changes -
        Field Original Value New Value
        Workflow jira [ 12447527 ] no-reopen-closed, patch-avail [ 12467826 ]
        Jukka Zitting added a comment -

        Without an example document there's little we can do about this. See the Tika project (http://tika.apache.org/) for the text extraction functionality Jackrabbit nowadays uses, and file an issue at https://issues.apache.org/jira/browse/TIKA if the problem still occurs with Tika.

        Jukka Zitting made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Incomplete [ 4 ]
        Jukka Zitting made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition            Time In Source Status   Execution Times   Last Executer   Last Execution Date
        Open -> Resolved      726d 3h 44m             1                 Jukka Zitting   29/Nov/10 14:13
        Resolved -> Closed    302d 7h 26m             1                 Jukka Zitting   27/Sep/11 22:39

          People

          • Assignee:
            Unassigned
            Reporter:
            Rajesh Upadhyay
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved:
