Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: core 1.4.3
    • Fix Version/s: None
    • Labels:
      None
    • Environment:
      OS: Windows 2003 SP2, MyEclipse 6.0 / Tomcat 5.5 and Athlon 500+

      Description

      Hi,
      I have a .doc file that contains data inside a table, and I want to parse the table to get its values. Normal parsing (I mean using StringTokenizer) does not work for the table because it yields unwanted special characters. So I want to convert the .doc to a .txt file first, since splitting the values would then be easy, but I can't get that to work. Can anyone please tell me how to parse the values of an MS Word table?

      We also need to know how to index a .doc file without these special characters.
      When we show the excerpt, the special characters make it unreadable.

      Thanks in advance.
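
The cleanup step asked about above can be sketched in plain Java, independent of which extractor produces the text. This is only a minimal sketch under stated assumptions, not Jackrabbit's or Tika's behaviour: it assumes the raw text taken from the .doc file marks the end of each table cell (and row) with the control character U+0007, as the binary Word format does, and that any other control characters are the "special characters" that make the excerpt unreadable. The class name `DocTextCleaner` and its methods are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class DocTextCleaner {
    // Assumption: the extractor emits U+0007 as the Word cell/row end mark.
    private static final char CELL_MARK = '\u0007';

    /** Returns text that is safe to index or show as an excerpt. */
    static String toPlainText(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == CELL_MARK) {
                out.append('\t');            // cell boundary becomes a tab
            } else if (c == '\r' || c == '\n' || c == '\u000B' || c == '\u000C') {
                out.append('\n');            // paragraph, line and page breaks
            } else if (c == '\t' || !Character.isISOControl(c)) {
                out.append(c);               // ordinary printable text
            }
            // every other control character is dropped
        }
        return out.toString();
    }

    /** Returns the non-empty cell values in document order. */
    static List<String> cells(String raw) {
        List<String> result = new ArrayList<>();
        for (String part : raw.split(String.valueOf(CELL_MARK))) {
            String text = toPlainText(part).trim();
            if (!text.isEmpty()) {
                result.add(text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical raw text of a 2x2 table: each cell ends with U+0007,
        // and each row ends with one extra U+0007.
        String raw = "Name\u0007Age\u0007\u0007Alice\u000730\u0007\u0007";
        System.out.println(cells(raw)); // prints [Name, Age, Alice, 30]
    }
}
```

Note that this flattens the table: because an empty cell and a row end look the same in plain text, the sketch discards empty cells and does not try to rebuild rows. Recovering rows reliably needs a parser that reads the document structure rather than the extracted string.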

        Activity

        Transition           Time In Source Status   Execution Times   Last Executer   Last Execution Date
        Open -> Resolved     726d 3h 44m             1                 Jukka Zitting   29/Nov/10 14:13
        Resolved -> Closed   302d 7h 26m             1                 Jukka Zitting   27/Sep/11 22:39
        Jukka Zitting made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Jukka Zitting made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Incomplete [ 4 ]
        Jukka Zitting added a comment -

        Without an example document there's little we can do about this. See the Tika project (http://tika.apache.org/) for the text extraction functionality Jackrabbit nowadays uses, and file an issue at https://issues.apache.org/jira/browse/TIKA if the problem still occurs with Tika.

        Jukka Zitting made changes -
        Field Original Value New Value
        Workflow jira [ 12447527 ] no-reopen-closed, patch-avail [ 12467826 ]
        Rajesh Upadhyay created issue -

          People

          • Assignee:
            Unassigned
            Reporter:
            Rajesh Upadhyay
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved:
