Details

    Description

      There is more to this issue than meets the eye. stringr::str_to_sentence() does two things:

      • capitalises the first word
      • if multiple sentences are provided as a single string, attempts to find sentence breaks and capitalises the first word of each sentence.

      The stringr implementation wraps stringi::stri_trans_totitle(), which in turn uses ICU’s BreakIterator to locate text boundaries. As a consequence, stringr::str_to_sentence() is not able to identify a full stop / period (".") as a sentence end and does not capitalise the word following it. Thus, the behaviour of the utf8_capitalize kernel (which capitalises the first word of a string without making any attempt to break it into sentences) differs from the behaviour of stringr::str_to_sentence().
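
      To make the two behaviours concrete, here is a minimal Python sketch using only the standard library. It calls neither stringr, stringi, ICU nor Arrow. Both function names are hypothetical, and the sentence-break rule (".", "!" or "?" followed by whitespace) is a deliberately naive assumption, much simpler than ICU's BreakIterator rules. The first function assumes the kernel uppercases the first character and lowercases the rest.

```python
import re

# Hypothetical, naive sentence-break rule: whitespace that follows ".", "!" or "?".
# The capturing group keeps the whitespace so the string can be rebuilt unchanged.
_SENTENCE_GAP = re.compile(r"((?<=[.!?])\s+)")


def capitalize_whole_string(text: str) -> str:
    """Capitalise only the first character of the whole string.

    Assumption: this mirrors a capitalise-first kernel that uppercases the
    first character and lowercases everything after it.
    """
    return text[:1].upper() + text[1:].lower()


def capitalize_each_sentence(text: str) -> str:
    """Capitalise the first character of every naively detected sentence."""
    parts = _SENTENCE_GAP.split(text)
    # Even indices hold sentences, odd indices hold the whitespace between them.
    for i in range(0, len(parts), 2):
        parts[i] = capitalize_whole_string(parts[i])
    return "".join(parts)


sample = "hello there. how are you? fine"
whole = capitalize_whole_string(sample)
per_sentence = capitalize_each_sentence(sample)
print(whole)
print(per_sentence)
```

      The two results differ whenever the input holds more than one sentence. A binding that maps a sentence-aware function onto a whole-string kernel would therefore diverge on such input.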

      For more extensive discussions around the stringi / stringr implementation see stringr issues 202 and 231.

      Due to the complexity of this issue and the relatively niche use cases, the recommendation is to postpone implementation.

            People

              dragosmg Dragoș Moldovan-Grünfeld
              thisisnic Nicola Crane

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h