Community Development / COMDEV-156

parseprojects.py: Calculation of projectJsonFilename is flawed


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component/s: Projects Tool
    • Labels: None

    Description

      The parseprojects.py script derives the projectJsonFilename variable from the DOAP homepage entry. If the homepage URL has path components after tlp.apache.org, the last component is appended to the TLP name.
      This process is not guaranteed to produce a unique file name. For example, the entry
      <homepage rdf:resource="http://commons.apache.org/beanutils/index.html"/>
      is converted to
      commons-index.html
      The other Commons DOAPs happen not to include index.html in their homepage entries, so their file names are unique, but only by chance. The approach can fail in other ways too, because there is no standard convention for project homepage URLs within a TLP website (nor should one be enforced).
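
      The collision can be illustrated with a minimal sketch. This is not the actual parseprojects.py code: the function name filename_from_homepage is hypothetical, the logic is reconstructed only from the description above, and the second URL is a made-up example.

```python
from urllib.parse import urlparse


def filename_from_homepage(homepage):
    """Hypothetical reconstruction of the homepage-based naming scheme."""
    parsed = urlparse(homepage)
    # First host label, e.g. "commons" from "commons.apache.org".
    tlp = parsed.hostname.split(".")[0]
    # Non-empty path components after the host.
    parts = [p for p in parsed.path.split("/") if p]
    # Append only the LAST path component to the TLP name.
    return tlp + "-" + parts[-1] if parts else tlp


# The entry from this report:
print(filename_from_homepage("http://commons.apache.org/beanutils/index.html"))
# -> commons-index.html

# A made-up second component that also ends in index.html collides with it:
print(filename_from_homepage("http://commons.apache.org/lang/index.html"))
# -> commons-index.html
```

      Because only the last path component survives, any two homepages under the same TLP that end in the same file name map to the same JSON file.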

      It would be a lot simpler (and more reliable) to use the DOAP name field instead.
      Trim the leading "Apache", convert to lower case, remove or replace spaces, and sanitize illegal filename characters.
      [The original code only allowed alphanumeric characters plus '-' and '+'. Everything else was converted to '_'.]

      Duplicate names may still occur, but they are not allowed, so resolving them is up to the projects concerned (and the code can report any duplicates it finds).
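
      A minimal sketch of the name-based approach, under stated assumptions: the function names filename_from_name and find_duplicates are hypothetical, spaces are replaced with '-' (the report leaves open whether to remove or replace them), and the allowed character set follows the note about the original code.

```python
import re
from collections import defaultdict


def filename_from_name(name):
    """Derive a file name stem from a DOAP name (illustrative sketch)."""
    base = name.strip()
    # Trim a leading "Apache" (case-insensitive) and the space after it.
    base = re.sub(r"^apache\s+", "", base, flags=re.IGNORECASE)
    base = base.lower()
    # Assumption: runs of whitespace become a single '-'.
    base = re.sub(r"\s+", "-", base)
    # Keep alphanumerics, '-' and '+'; everything else becomes '_'.
    return re.sub(r"[^a-z0-9+-]", "_", base)


def find_duplicates(names):
    """Map each colliding file name stem to the DOAP names that produce it."""
    seen = defaultdict(list)
    for name in names:
        seen[filename_from_name(name)].append(name)
    return {stem: group for stem, group in seen.items() if len(group) > 1}


print(filename_from_name("Apache Commons BeanUtils"))  # -> commons-beanutils
print(find_duplicates(["Apache Foo", "apache foo", "Apache Bar"]))
# -> {'foo': ['Apache Foo', 'apache foo']}
```

      Reporting collisions rather than silently overwriting a file leaves the fix with the projects whose names clash.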

          People

            Assignee: Unassigned
            Reporter: Sebb (sebb)