UIMA-5723

MARKTABLE fails to assign feature for single word entry in first CSV column

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.6.1ruta
    • Fix Version/s: None
    • Component/s: Ruta
    • Labels: None

    Description

      When using Ruta's MARKTABLE action with a CSV file nl_law_names.csv like this:

      WAZ;WAZELF
      Wet arbeidsongeschiktheidsverzekering zelfstandigen;WAZELF
      

      and a corresponding Ruta script containing these lines:

      WORDTABLE LawNameTable = 'nl_law_names.csv';
      Document{->MARKTABLE(WetNaam, 1, LawNameTable, "WetIdentifier" = 2)};
      

      it seems that the text WAZ is detected, but the WetIdentifier feature of the resulting annotation is not filled with the value from the second column, i.e. the string following the semicolon (WAZELF). Instead, it remains empty.

      (Note: WetNaam annotation is defined elsewhere via type system description)
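
      For a self-contained reproduction, the type could also be declared directly in the script instead of in a separate type system description. The following is only a minimal sketch: it assumes that WetIdentifier is a STRING feature, which the actual type system description (not shown in this report) may define differently.

      // Assumption: WetNaam has a single STRING feature named WetIdentifier.
      DECLARE Annotation WetNaam(STRING WetIdentifier);
      // Table from the report: column 1 = name to match, column 2 = identifier.
      WORDTABLE LawNameTable = 'nl_law_names.csv';
      // Mark every occurrence of a column-1 entry and copy column 2 into the feature.
      Document{->MARKTABLE(WetNaam, 1, LawNameTable, "WetIdentifier" = 2)};

      With the two-row CSV above, one would expect both WAZ and the fully written name to yield a WetNaam annotation with WetIdentifier = WAZELF.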

      In contrast, the fully written name Wet arbeidsongeschiktheidsverzekering zelfstandigen is detected and processed as expected, with feature WetIdentifier = WAZELF after annotating.

      Could it be that problems arise when only a single word (i.e. no spaces or uppercase letters following lowercase chars) is present in the first column in the CSV file? Or is it a matter of configuration?

      We also experimented with the optional arguments of MARKTABLE that control case sensitivity, but to no avail.
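
      For reference, a call that uses the optional arguments might look like the sketch below. The order of the optional block (ignore case, ignore length, ignored characters, maximum number of ignored characters) is assumed from the MARKTABLE definition in the Ruta guide. The concrete values are purely illustrative and are not the ones actually tested for this report.

      // Optional block: ignoreCase, ignoreLength, ignoreChar, maxIgnoreChar (illustrative values).
      Document{->MARKTABLE(WetNaam, 1, LawNameTable, true, 0, "", 0, "WetIdentifier" = 2)};

      If the feature stays empty for WAZ regardless of these settings, that would suggest the problem lies in the feature assignment rather than in the matching itself.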

          People

            pkluegl Peter Klügl
            andreasdot Andreas Thiel