  1. Tika
  2. TIKA-2602

iCalendar not properly recognized as text/calendar

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      At the moment, detection of text/calendar is covered by the following mime-type element:

        <mime-type type="text/calendar">
          <magic priority="50">
            <match value="BEGIN:VCALENDAR" type="string" offset="0">
              <match value="VERSION:2.0" type="string" offset="15:30"/>
            </match>
          </magic>
          <glob pattern="*.ics"/>
          <glob pattern="*.ifb"/>
          <sub-class-of type="text/plain"/>
        </mime-type>
      

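      This recognition fails if VERSION:2.0 is not the first property after BEGIN:VCALENDAR. BEGIN:VCALENDAR is 15 bytes long, so the 15:30 offset range only leaves room for the line break and, at most, a very short property before VERSION:2.0.
      RFC 5545 does not require VERSION to come first (see https://tools.ietf.org/html/rfc5545, 3.6. Calendar Components), so recognition may fail for calendar objects that start with PRODID or other properties.

      Section "4. iCalendar Object Examples" shows some of these cases. In the first one, the 56-character PRODID line alone pushes VERSION:2.0 well past offset 30: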

             BEGIN:VCALENDAR
             PRODID:-//xyz Corp//NONSGML PDA Calendar Version 1.0//EN
             VERSION:2.0
             BEGIN:VEVENT
             DTSTAMP:19960704T120000Z
             UID:uid1@example.com
             ORGANIZER:mailto:jsmith@example.com
             DTSTART:19960918T143000Z
             DTEND:19960920T220000Z
             STATUS:CONFIRMED
             CATEGORIES:CONFERENCE
             SUMMARY:Networld+Interop Conference
             DESCRIPTION:Networld+Interop Conference
               and Exhibit\nAtlanta World Congress Center\n
              Atlanta\, Georgia
             END:VEVENT
             END:VCALENDAR
      

      or

             BEGIN:VCALENDAR
             METHOD:xyz
             VERSION:2.0
             PRODID:-//ABC Corporation//NONSGML My Product//EN
             BEGIN:VEVENT
             DTSTAMP:19970324T120000Z
             SEQUENCE:0
             UID:uid3@example.com
             ORGANIZER:mailto:jdoe@example.com
             ATTENDEE;RSVP=TRUE:mailto:jsmith@example.com
             DTSTART:19970324T123000Z
             DTEND:19970324T210000Z
             CATEGORIES:MEETING,PROJECT
             CLASS:PUBLIC
             SUMMARY:Calendaring Interoperability Planning Meeting
             DESCRIPTION:Discuss how we can test c&s interoperability\n
              using iCalendar and other IETF standards.
             LOCATION:LDB Lobby
             ATTACH;FMTTYPE=application/postscript:ftp://example.com/pub/
              conf/bkgrnd.ps
             END:VEVENT
             END:VCALENDAR
      

      I suggest either
      a) widening the offset range of the VERSION match from 15:30 to 15:200 or similar (not a good approach, since we don't know how long the PRODID might be),
      or
      b) adding sub-matches for CALSCALE, PRODID, and METHOD. (This might still not cover everything, since x-prop and iana-prop properties are also allowed. So far I can only confirm PRODID and METHOD as the first property after BEGIN:VCALENDAR.)
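
      A minimal, untested sketch of option b). It assumes that sibling sub-matches nested under the BEGIN:VCALENDAR match are treated as alternatives (as in the freedesktop shared-mime-info format that this file follows) and that reusing the 15:30 offset range for the new sub-matches is acceptable. The three added values are only the properties named above, not an exhaustive list:

      ```xml
      <mime-type type="text/calendar">
        <magic priority="50">
          <match value="BEGIN:VCALENDAR" type="string" offset="0">
            <!-- Existing rule: VERSION is the first property -->
            <match value="VERSION:2.0" type="string" offset="15:30"/>
            <!-- Assumed additions: another calendar property comes first -->
            <match value="PRODID:" type="string" offset="15:30"/>
            <match value="METHOD:" type="string" offset="15:30"/>
            <match value="CALSCALE:" type="string" offset="15:30"/>
          </match>
        </magic>
        <glob pattern="*.ics"/>
        <glob pattern="*.ifb"/>
        <sub-class-of type="text/plain"/>
      </mime-type>
      ```

      With this variant, the added branches no longer check VERSION:2.0 at all, so it trades some strictness for coverage. Files whose first property is an x-prop or iana-prop would still not be matched.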

      Regards

      Andreas

      Attachments

        1. VERSION_Test
          0.3 kB
          Andreas Meier

          People

            Assignee: Unassigned
            Reporter: Andreas Meier