  1. Apache AsterixDB
  2. ASTERIXDB-2627

internal error during LOAD (found by Nick)

Details

    Description

      RUN THIS:

      USE Personicle;
      //dropping and recreating since AsterixDB LOAD requires an empty dataset
      DROP DATASET events IF EXISTS;
      CREATE DATASET events(EventType) PRIMARY KEY eventId AUTOGENERATED;
      LOAD DATASET events USING localfs
      (("path"=":///Users/macandcheese/Desktop/Personicle/old-personicle-parser/sample-event-dump.jsonl"),("format"="adm"));

      BUT FIRST RUN THIS (ALSO ATTACHED) TO SET THINGS UP:

      --To begin we’ll go ahead and delete any existing dataverse named Personicle
      --this ensures we’re writing into an empty dataverse (note ALL data in the
      --dataverse will be deleted, so be careful). This is included so that this
      --code can be re-run without causing naming conflicts, but is not required
      --to create a dataverse.

      DROP DATAVERSE Personicle IF EXISTS;

      --Now that we have a clean slate let’s create our dataverse and call it
      --Personicle, though the name is arbitrary.

      CREATE DATAVERSE Personicle;

      --Sets the current working dataverse for whatever SQL++ statements follow it.
      --This is akin to USE in SQL.

      USE Personicle;

      --The question mark represents an optional field which may or may not be present.
      --I’ve also specified some uuid types which I’ll later use for some auto-generated
      --primary keys. Finally, ADM supports types within types, so you can use custom
      --types for the fields. I’m not going to delve too deep into this, because the
      --documentation explains it well enough. Here are a couple of links to
      --get you started:
      --https://ci.apache.org/projects/asterixdb/datamodel.html
      --https://ci.apache.org/projects/asterixdb/sqlpp/primer-sqlpp.html

      CREATE TYPE AddressType AS OPEN

      { category: string?, address: string, city: string, state: string, postalCode: string }

      ;
      CREATE TYPE PhoneDetailType AS OPEN

      { category: string?, phone: string }

      ;
      CREATE TYPE EmailDetailType AS OPEN

      { category: string?, email: string }

      ;
      CREATE TYPE UserType AS OPEN

      {
        --primary key uuid for user
        userId: uuid,
        name: string?,
        dateOfBirth: date?,
        addresses: [AddressType]?,
        phones: [PhoneDetailType]?,
        emails: [EmailDetailType]?
      }

      ;

      CREATE TYPE ObservationType AS OPEN

      {
        observationId: uuid, --primary key uuid for observation
        dataSource: uuid, --foreign key to the data source which generated the observation, like a smart device, API or human.
        about: uuid, --foreign key to the user this observation is about
        label: string?, --labels are human readable identifiers, for example "location observation for James"
        start_at: datetime, --when the observation began being observed
        end_at: datetime? --when the observation stopped being observed, some observations won't need this
      }

      ;
      CREATE TYPE EventType AS OPEN

      {
        eventId: uuid, --primary key uuid for event
        participant: uuid, --foreign key to user that participated in this event
        start_at: datetime, --start of event
        end_at: datetime, --end of event using `` because of a potential AsterixDB bug
        name: string,
        level: bigint?, --Jordan's paper referred to different levels of daily activities.
        observations: [uuid] --foreign keys to ids of observations which were used to make up this event
      }

      ;
      CREATE TYPE DataSourceType AS OPEN

      {
        dataSourceId: uuid, --primary key uuid for data sources
        owner: uuid, --foreign key to the user who owns this data source
        label: string?, --human-readable string for the data source, example: "Mike's iPhone"
        personicleHost: string? --this represents metadata about the Personicle application which was used for this data source, example: "Personicle Android application v1.2.43"
      }

      ;

      --The code below creates collections using the ADM specified above.
      --Therefore this code can be run as boilerplate code to get started with using AsterixDB.
      --Note that I've specified the primary key to be "AUTOGENERATED" which
      --means AsterixDB will create one for us if the UUID key is omitted,
      --if it is provided AsterixDB will use the UUID that was provided.

      USE Personicle;

      --We will have one collection for each of the ER diagram's top-level entity type instances.
      --(Instances of their subtypes will conform to the collection's type but will also have additional attributes
      --appropriate to their particular subtype. AsterixDB's open types make this possible.) In MongoDB, we would
      --have these same collections. We just wouldn't bother to define any of their type info (and the system wouldn't
      --do any checking); everything would be by convention/documentation, but the data would be exactly the same.

      CREATE DATASET users(UserType) PRIMARY KEY userId AUTOGENERATED;
      CREATE DATASET observations(ObservationType) PRIMARY KEY observationId AUTOGENERATED;
      CREATE DATASET events(EventType) PRIMARY KEY eventId AUTOGENERATED;
      CREATE DATASET datasources(DataSourceType) PRIMARY KEY dataSourceId AUTOGENERATED;
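
      --Optional sanity check (a sketch, not part of the original repro): because
      --userId is AUTOGENERATED, an INSERT that omits the key should still succeed,
      --with AsterixDB filling in a uuid. The sample name below is made up for
      --illustration; the SELECT should then show the generated userId.

      INSERT INTO users ([{"name": "Example User"}]);
      SELECT VALUE u FROM users u;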

      Attachments

        1. MASTER-personicle-espresso-schema-configuration-script.sqlpp
          4 kB
          Michael J. Carey
        2. sample-event-dump.jsonl
          5 kB
          Michael J. Carey
        3. logs.zip
          24 kB
          Michael J. Carey

          People

            imaxon Ian Maxon
            dtabass Michael J. Carey
