  1. S2Graph
  2. S2GRAPH-50

Provide new HBase Storage Schema

    Details

    • Type: New Feature
    • Status: Done
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None

      Description

      I think we need to offer a choice between `Tall` and `Wide` row schemas for IndexEdge. The key difference between the two is as follows.

      1. Wide.
      Adjacent edges are stored in a single row with wide columns, and we use a get request to fetch them. This is how IndexEdge is currently stored.

      2. Tall.
      Adjacent edges are stored in multiple `consecutive` rows, and we use a scanner to scan through them.
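
The two layouts can be illustrated with a small in-memory model. This is only a sketch: it uses plain Python dictionaries and a sorted key list in place of HBase, and the names `WideStore`, `TallStore`, and the `(src, tgt)` key layout are assumptions made for illustration, not S2Graph's actual schema.

```python
import bisect


class WideStore:
    """One row per source vertex; each adjacent edge is a column in that row."""

    def __init__(self):
        self.rows = {}  # row key -> {qualifier: value}

    def put(self, src, tgt, value):
        self.rows.setdefault(src, {})[tgt] = value

    def get(self, src):
        # A single "get" returns every adjacent edge, ordered by qualifier.
        return sorted(self.rows.get(src, {}).items())


class TallStore:
    """One row per edge; row key is (src, tgt), so a vertex's edges are consecutive rows."""

    def __init__(self):
        self.keys = []    # row keys, kept sorted
        self.values = {}  # row key -> value

    def put(self, src, tgt, value):
        key = (src, tgt)
        if key not in self.values:
            bisect.insort(self.keys, key)
        self.values[key] = value

    def scan(self, src):
        # A "scan" seeks to the first row with prefix `src` and stops
        # as soon as the prefix changes.
        i = bisect.bisect_left(self.keys, (src,))
        while i < len(self.keys) and self.keys[i][0] == src:
            key = self.keys[i]
            yield key[1], self.values[key]
            i += 1
```

In both models the edges of one vertex come back in the same sorted order; what differs is whether they live inside one row (fetched at once) or across neighbouring rows (streamed by a scanner).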

      Once S2GRAPH-17 is resolved, I think the only thing we have to do is provide an `IndexEdgeSerializer` and `IndexEdgeDeserializer` for the Tall row schema on HBase. I think this is a fairly trivial task, since we already have all the primitives for it.
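
To make the serializer/deserializer idea concrete, here is a minimal sketch of how a Tall variant could reuse the same byte-level primitives as a Wide one by moving the qualifier bytes into the row key. Everything here is hypothetical: the `KeyValue` tuple, the length-prefixed string encoding, and the function names are illustrative assumptions and do not reflect the real `IndexEdgeSerializer` byte layout.

```python
import struct
from typing import NamedTuple


class KeyValue(NamedTuple):
    row: bytes
    qualifier: bytes
    value: bytes


def _pack(s):
    # Shared primitive: length-prefixed UTF-8, so fields can be split apart again.
    raw = s.encode("utf-8")
    return struct.pack(">H", len(raw)) + raw


def _unpack(buf, pos):
    # Returns the decoded string and the position just past it.
    (n,) = struct.unpack_from(">H", buf, pos)
    start = pos + 2
    return buf[start:start + n].decode("utf-8"), start + n


def serialize_wide(src, tgt, value):
    # Wide: source vertex in the row key, target vertex in the qualifier.
    return KeyValue(_pack(src), _pack(tgt), value)


def serialize_tall(src, tgt, value):
    # Tall: source and target both in the row key, empty qualifier.
    return KeyValue(_pack(src) + _pack(tgt), b"", value)


def deserialize_wide(kv):
    src, _ = _unpack(kv.row, 0)
    tgt, _ = _unpack(kv.qualifier, 0)
    return src, tgt, kv.value


def deserialize_tall(kv):
    src, pos = _unpack(kv.row, 0)
    tgt, _ = _unpack(kv.row, pos)
    return src, tgt, kv.value
```

Under this assumed encoding, every Tall row key for a vertex starts with that vertex's Wide row key, which is what lets a prefix scan find all of its edges.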

      The hard part would be changing the interface for clients.

      Currently, queries support `offset` and `limit` for pagination. If we use a scanner, there is no easy way to support `offset`.
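
The pagination problem can be sketched as follows. A scanner only moves forward from a start row, so `offset` has to be emulated by reading and discarding rows. One commonly used alternative, shown here purely as an illustration and not as something this issue proposes, is to let the client pass back the last key it saw. The function names and the list standing in for a sorted HBase key range are assumptions.

```python
import bisect
from itertools import islice


def page_by_offset(sorted_keys, offset, limit):
    # Emulating `offset` on a scanner: the first `offset` rows are read
    # and thrown away, so the cost grows with offset + limit.
    scanner = iter(sorted_keys)
    return list(islice(scanner, offset, offset + limit))


def page_by_cursor(sorted_keys, last_key, limit):
    # Cursor style: restart the scan just after the last key the client saw.
    start = 0 if last_key is None else bisect.bisect_right(sorted_keys, last_key)
    return sorted_keys[start:start + limit]
```

Both functions return the same page when the cursor is the last key of the previous page, but only the first one has to walk over the skipped rows.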

      I think it is worth trying the Tall row schema and benchmarking it against the Wide row schema. I also think this would be very beneficial for others who are interested in implementing other storage backends such as RocksDB or LevelDB (including myself).

      I will follow up with benchmarks on both `Tall` and `Wide` rows; then we can decide which schema should be the default. What do others think?

            People

            • Assignee:
              steamshon Do Yung Yoon
              Reporter:
              steamshon Do Yung Yoon
            • Votes:
              0
              Watchers:
              2

                Time Tracking

                Estimated:
                168h
                Remaining:
                168h
                Logged:
                Not Specified