  1. Apache Jena
  2. JENA-1930

How to load data and then start a fuseki server with that dataset and use it?


Details

    • Question
    • Status: Closed
    • Minor
    • Resolution: Feedback Received
    • Jena 3.14.0, Jena 3.15.0

    Description

      After my (unfortunately unsuccessful) attempt to load Wikidata with tdb2.tdbloader, I modified my script to load a smaller dataset, as shown below (gnd2jena). The output log seems to be OK.

      15:01:25 INFO loader :: Add: 9,500,000 authorities-kongress_lds.ttl (Batch: 28,457 / Avg: 33,975)
      15:01:29 INFO loader :: Finished: authorities-kongress_lds.ttl: 9,556,155 tuples in 282.97s (Avg: 33,771)
      15:02:06 INFO loader :: Finish - index SPO
      15:02:12 INFO loader :: Finish - index POS
      15:02:12 INFO loader :: Finish - index OSP
      15:02:12 INFO loader :: Time = 326.504 seconds : Triples = 9,556,155 : Rate = 29,268 /s
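      As a quick sanity check on the last log line, the reported rate should simply be the triple count divided by the elapsed seconds. A minimal sketch, using a hypothetical helper load_rate (my own name, not a Jena tool):

```shell
#!/bin/bash
# load_rate TRIPLES SECONDS: print the average load rate in triples per second,
# rounded to the nearest integer. Hypothetical helper, not part of Jena.
load_rate() {
  awk -v t="$1" -v s="$2" 'BEGIN { printf "%.0f\n", t / s }'
}

# Values taken from the "Time = ... : Triples = ..." log line
load_rate 9556155 326.504   # prints 29268
```

      This matches the "Rate = 29,268 /s" figure in the log, so the summary line is at least internally consistent.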

      Now I am trying to run a fuseki server using the data like this:

      #!/bin/bash
      # WF 2020-06-25
      # Jena Fuseki server installation
      # see https://jena.apache.org/documentation/fuseki2/fuseki-run.html
      version=3.14.0
      fuseki=apache-jena-fuseki-$version
      if [ ! -d $fuseki ]
      then
       if [ ! -f $fuseki.tar.gz ]
       then
       wget http://archive.apache.org/dist/jena/binaries/$fuseki.tar.gz
       else
       echo $fuseki.tar.gz already downloaded
       fi
       echo "unpacking $fuseki.tar.gz"
       tar xvfz $fuseki.tar.gz
      else
       echo $fuseki already downloaded and unpacked
      fi
      cd $fuseki
      java -jar fuseki-server.jar --tdb2 --loc=../data /gnd

      Please note that I used version 3.14.0 here because there were reports that 3.15.0 needs some kind of patching, which I hoped to avoid by using the previous version. I assumed that the TDB store would still be compatible, or that a message would be shown if it were not.
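      Since --loc=../data is resolved relative to the Fuseki directory while gnd2jena writes to $base/data, it may be worth confirming that both point at the same place before starting the server. The sketch below assumes that a TDB2 location contains a generation directory such as Data-0001; is_tdb2_dir is a hypothetical helper name, not part of Jena or Fuseki:

```shell
#!/bin/bash
# is_tdb2_dir DIR: succeed if DIR looks like a TDB2 database location,
# i.e. it contains at least one Data-NNNN generation directory.
# Hypothetical helper for illustration only.
is_tdb2_dir() {
  local dir="$1"
  local gen
  for gen in "$dir"/Data-[0-9][0-9][0-9][0-9]; do
    # If the glob matched nothing it stays literal and -d fails
    [ -d "$gen" ] && return 0
  done
  return 1
}

# Usage: check the location passed to --loc before starting the server
#   if is_tdb2_dir ../data; then
#     java -jar fuseki-server.jar --tdb2 --loc=../data /gnd
#   else
#     echo "../data does not look like a TDB2 database" >&2
#   fi
```

      If the check fails, the server would most likely be starting on a new, empty database rather than on the loaded one.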

      On the server's port 3030 a user interface comes up, showing that the server status is OK.

      When clicking "Manage datasets" I get two options:

      • existing datasets
      • add new dataset

      Neither of the two buttons has any effect; clicking them produces no visible reaction. I would have expected to be able to work with the imported dataset immediately (I do not even know whether you would call the import a dataset ...)
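      Independently of the web UI, the dataset can probably be checked directly over HTTP. A minimal sketch, assuming the server runs on localhost:3030 with the dataset mounted at /gnd as in the command above and exposes the query service under /gnd/sparql; the function names are my own:

```shell
#!/bin/bash
# Assumed default: Fuseki listening on localhost:3030
FUSEKI_URL="${FUSEKI_URL:-http://localhost:3030}"

# sparql_endpoint DATASET: print the query endpoint URL for a dataset name
sparql_endpoint() {
  local dataset="${1#/}"   # accept "gnd" as well as "/gnd"
  echo "$FUSEKI_URL/$dataset/sparql"
}

# count_triples DATASET: ask the server for the number of triples
# in the default graph, returned as SPARQL JSON results
count_triples() {
  curl --silent --fail \
    --header 'Accept: application/sparql-results+json' \
    --data-urlencode 'query=SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }' \
    "$(sparql_endpoint "$1")"
}

# Usage (requires a running server):
#   count_triples gnd
```

      If the count comes back close to the 9,556,155 triples reported by the loader, the data is being served and only the UI is at fault.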

      I have already found a hint that I would have to edit config.ttl manually to get the desired effect. I find it quite confusing that the web UI gives no hint about this. I filed this as a question, not knowing whether it would end up as a feature request or a bug ...
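      For reference, a hand-written configuration might look roughly like the sketch below. It assumes the assembler vocabulary from the Fuseki configuration documentation (fuseki:Service, tdb2:DatasetTDB2, tdb2:location) applies to this version; the file name gnd-config.ttl, the node names and the helper write_gnd_config are made up for illustration:

```shell
#!/bin/bash
# write_gnd_config FILE TDB2_DIR: write a minimal, read-only Fuseki service
# description for an existing TDB2 database. Hypothetical helper.
write_gnd_config() {
  local file="$1"
  local location="$2"
  cat > "$file" <<EOF
PREFIX fuseki: <http://jena.apache.org/fuseki#>
PREFIX rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX tdb2:   <http://jena.apache.org/2016/tdb#>

<#service_gnd> rdf:type fuseki:Service ;
    fuseki:name         "gnd" ;
    fuseki:serviceQuery "sparql" , "query" ;
    fuseki:dataset      <#dataset_gnd> .

<#dataset_gnd> rdf:type tdb2:DatasetTDB2 ;
    tdb2:location "$location" .
EOF
}

# Usage, with the data directory from gnd2jena:
#   write_gnd_config gnd-config.ttl /hd/luxio/gnd/data
#   java -jar fuseki-server.jar --config=gnd-config.ttl
```

      An absolute path for tdb2:location avoids any doubt about which directory a relative location such as ../data refers to.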

      gnd2jena

       

      #!/bin/bash
      # WF 2020-05-10
      # global settings
      jena=apache-jena-3.15.0
      tgz=$jena.tar.gz
      jenaurl=http://mirror.easyname.ch/apache/jena/binaries/$tgz
      base=/hd/luxio/gnd
      data=$base/data
      tdbloader=$jena/bin/tdb2.tdbloader
      getjena() {
      # download
      if [ ! -f $tgz ]
      then
       echo "downloading $tgz from $jenaurl"
       wget $jenaurl
      else
       echo "$tgz already downloaded"
      fi
      # unpack
      if [ ! -d $jena ]
      then
       echo "unpacking $jena from $tgz"
       tar xvzf $tgz
      else
       echo "$jena already unpacked"
      fi
      # create data directory
      if [ ! -d $data ]
      then
       echo "creating $data directory"
       mkdir -p $data
      else
       echo "$data directory already created"
      fi
      }
      #
      # show the given timestamp
      #
      timestamp() {
       local msg="$1"
       local ts=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
       echo "$msg at $ts"
      }
      #
      # load data for the given data dir and input
      #
      loaddata() {
       local data="$1"
       local input="$2"
       timestamp "start loading $input to $data"
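       # $phase is not set anywhere in this script; fall back to "load" so the log names are well-formed
       $tdbloader --loader=parallel --loc "$data" "$input" > "tdb2-${phase:-load}-out.log" 2> "tdb2-${phase:-load}-err.log"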
       timestamp "finished loading $input to $data"
      }
      getjena
      export TMPDIR=$base/tmp
      if [ ! -d $TMPDIR ]
      then
       echo "creating temporary directory $TMPDIR"
       mkdir $TMPDIR
      else
       echo "using temporary directory $TMPDIR"
      fi
      if [ ! -f authorities-kongress_lds.ttl ]
      then
       wget https://data.dnb.de/opendata/authorities-kongress_lds.ttl.gz
       gunzip authorities-kongress_lds.ttl.gz
      fi
      loaddata $data authorities-kongress_lds.ttl
      

       

      Attachments

        1. fusekiscreenshot2020-06-26.png
          70 kB
          Wolfgang Fahl

          People

            Assignee: Unassigned
            Reporter: Wolfgang Fahl