Apache Jena / JENA-1379

Replace TDB NodeTableTrans

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: Jena 3.4.0
    • Fix Version: Jena 3.5.0
    • Component: TDB
    • Labels: None

    Description

      TDB NodeTableTrans is complicated. It combines an existing NodeTable with an additional index (often in-memory) and a journal-like ObjectFile that holds the new nodes added in a transaction. It has to maintain a mapping between the new nodes in the journal ObjectFile and their eventual location in the main node file. On commit, it writes the journal ObjectFile nodes to the underlying index. The problem is that writing the index is not done completely safely, although the window of vulnerability (coordinating the index update with the object file update) is quite small.
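
      The bookkeeping described above can be illustrated with a minimal sketch. It is not TDB code: the class name `JournalledNodeFileSketch` is hypothetical, in-memory lists stand in for the files, and it assumes that a node id is simply the node's position in the main node file. Under that assumption, a node written to the journal must be handed out with the id it will have only after the commit copies it across.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; the real NodeTableTrans bookkeeping may differ.
class JournalledNodeFileSketch {
    private final List<String> baseFile = new ArrayList<>(); // stands in for the main node file
    private final List<String> journal = new ArrayList<>();  // new nodes of the open transaction

    // Store the node in the journal, but return its eventual position in the main file.
    int add(String node) {
        journal.add(node);
        return baseFile.size() + journal.size() - 1;
    }

    // Ids below the main file length resolve there; the rest map back into the journal.
    String get(int id) {
        return id < baseFile.size() ? baseFile.get(id) : journal.get(id - baseFile.size());
    }

    // Commit copies every journalled node to the main file (the copy cost of this design).
    void commit() {
        baseFile.addAll(journal);
        journal.clear();
    }

    // Abort discards the journal; the main file is untouched.
    void abort() {
        journal.clear();
    }
}
```

      In this sketch an id returned by `add` resolves to the same node before and after `commit`, but only because the main file length does not change while the transaction is open.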

      NodeTableBuilder is part of the way TDB datasets get built. A simpler design is for `NodeTable`s to be built in a fixed fashion from the basic components, `BlockMgr`s and `ObjectFile`s (the two units of storage in TDB). The potential flexibility of the current design has never been exploited.

      There are two independent parts to this change:

      1. A transactional index (based on the same machinery as the tuple indexes), with nodes appended directly to the object file of the NodeTable.
      2. An independent transactional object file.

      Directly appending is safe because these files only grow. Only nodes in the associated index are accessible. Abort resets the append point; a crash during a write transaction can, at worst, leave unused junk in the object file, but this is a trade-off between speed and recovery. A journalled object file for additions would avoid junk in some crash situations, though it imposes a copy cost. The proposal is to go for simplicity and speed; a simpler design is also easier to make crash-safe.
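
      The append-only argument can be sketched as follows. This is a minimal illustration, not the TDB implementation: the class `AppendOnlyNodeStoreSketch` and all of its methods are hypothetical, an in-memory list stands in for the object file, and maps stand in for the transactional index. It shows that only indexed entries are reachable, that abort resets the append point, and that a simulated crash leaves junk that is never seen.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; names and structure are assumptions, not TDB code.
class AppendOnlyNodeStoreSketch {
    private final List<String> objectFile = new ArrayList<>();            // only ever grows on success
    private final Map<String, Integer> committedIndex = new HashMap<>(); // node -> position, committed
    private final Map<String, Integer> txnIndex = new HashMap<>();       // index changes of the open txn
    private int txnStart = -1;                                            // append point at begin; -1 = no txn

    void begin() {
        if (txnStart >= 0) throw new IllegalStateException("transaction already open");
        txnStart = objectFile.size();
    }

    // Append directly to the object file; the node is reachable only through an index.
    int add(String node) {
        requireTxn();
        Integer existing = lookup(node);
        if (existing != null) return existing;
        int position = objectFile.size();
        objectFile.add(node);
        txnIndex.put(node, position);
        return position;
    }

    // Returns null for anything not in an index, including junk left in the file.
    Integer lookup(String node) {
        Integer position = txnIndex.get(node);
        return position != null ? position : committedIndex.get(node);
    }

    void commit() {
        requireTxn();
        committedIndex.putAll(txnIndex);
        txnIndex.clear();
        txnStart = -1;
    }

    // Abort resets the append point by truncating back to the length seen at begin.
    void abort() {
        requireTxn();
        objectFile.subList(txnStart, objectFile.size()).clear();
        txnIndex.clear();
        txnStart = -1;
    }

    // Simulated crash: uncommitted index changes are lost, appended entries remain as junk.
    void crashDuringWrite() {
        requireTxn();
        txnIndex.clear();
        txnStart = -1;
    }

    int fileLength() {
        return objectFile.size();
    }

    private void requireTxn() {
        if (txnStart < 0) throw new IllegalStateException("no open transaction");
    }
}
```

      In this sketch, a node added before a simulated crash stays in the list but `lookup` returns null for it; adding it again in a later transaction appends a fresh copy after the junk.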

      The alternative to this proposal is not keeping the existing code; it is the unused (and hence not deployment-tested) code in ObjectFileTransComplex (working name), which provides a more complicated journalled object file.

      The on-disk format is not changed. Switching from Jena 3.4.0 or earlier to Jena 3.5.0 should be safe for valid databases. Going backwards should also work if the database (not tested). The safest way is to require that recovery is done with the same version of TDB, with a test in the new code that notices and exits if it encounters old files.

            People

              Assignee: Andy Seaborne
              Reporter: Andy Seaborne