  1. Tika
  2. TIKA-1609

Leverage Google's LibPhonenumber for enhanced phone number extraction and metadata modeling

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 1.17, 2.0.0-BETA, 2.1.0
    • Component/s: core
    • Labels: None

    Description

      Google's libphonenumber can give us comprehensive support for modeling phone number metadata properly in Tika.
      During the development of this patch I realized three things:

      • This is not a parser as such, because phone numbers are not mapped to any particular MIME type.
      • There can be many phone numbers per document, so this is most likely a content handler of sorts.
      • Tika's Metadata support is currently too restrictive to let us persist complex objects, e.g. key/value pairs such as String, Object. We need to expand Metadata support over and above String, String[].

      https://github.com/googlei18n/libphonenumber/
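
      The three points above could be sketched roughly as follows. This is only an illustration and not the patch itself: the class name PhoneNumberCollectingHandler, the "phonenumbers" key, and the Map<String, List<Object>> store are all hypothetical, and the regular expression is a crude stand-in (an assumption) for libphonenumber's own matching. It uses only the SAX DefaultHandler that ships with the JDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.xml.sax.helpers.DefaultHandler;

/**
 * Hypothetical handler (not part of Tika): buffers the character events of
 * a document and records every phone-number-like string it finds.
 */
class PhoneNumberCollectingHandler extends DefaultHandler {

    // Stand-in pattern (assumption): optional "+", then 7 to 15 digits with
    // optional single separators. A real version would delegate to libphonenumber.
    private static final Pattern CANDIDATE =
            Pattern.compile("\\+?\\d(?:[ .-]?\\d){6,14}");

    // Text is buffered so that numbers split across SAX chunks are still found.
    private final StringBuilder text = new StringBuilder();

    // Hypothetical String -> Object store, instead of String -> String[].
    private final Map<String, List<Object>> metadata = new LinkedHashMap<>();

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endDocument() {
        Matcher m = CANDIDATE.matcher(text);
        while (m.find()) {
            // One structured value per number, rather than a flat String.
            Map<String, Object> number = new LinkedHashMap<>();
            number.put("raw", m.group());
            number.put("digits", m.group().replaceAll("\\D", ""));
            metadata.computeIfAbsent("phonenumbers", k -> new ArrayList<>())
                    .add(number);
        }
    }

    public Map<String, List<Object>> getMetadata() {
        return metadata;
    }
}
```

      In a real implementation the regular expression would presumably be replaced by libphonenumber's PhoneNumberUtil.findNumbers(text, defaultRegion), which yields parsed matches and so supplies exactly the kind of structured values the String, Object model is meant to hold.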

            People

              Assignee: lewismc Lewis John McGibbney
              Reporter: lewismc Lewis John McGibbney
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated: