  1. VXQuery (Retired)
  2. VXQUERY-131

Supporting Hadoop and Yarn

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      Many organizations use Hadoop, so VXQuery should be able to read data stored in HDFS. The project involves designing a strategy (with the mentor's guidance) for reading XML data from HDFS and then implementing it. Because HDFS stores a large file as blocks spread across nodes, the strategy may need to cover how to read only a section of an XML file.
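
      One possible way to read a section of an XML file is sketched below. It is not VXQuery's design, only an illustration under stated assumptions: the file is a flat sequence of records with a known element name, records of that name do not nest, and the tag text does not appear inside comments or CDATA. The function name `read_records` is hypothetical, and an in-memory `bytes` object stands in for an HDFS input stream. The rule is that a record belongs to the split containing the first byte of its opening tag, and the reader may run past the end of the split to finish that record.

```python
def read_records(data: bytes, start: int, end: int, tag: str) -> list[bytes]:
    """Return every <tag>...</tag> record whose opening tag begins in [start, end).

    Hypothetical helper: `data` stands in for a seekable HDFS stream.
    Assumes records named `tag` do not nest and are not self-closing.
    """
    open_prefix = b"<" + tag.encode()
    close_tag = b"</" + tag.encode() + b">"
    # Bytes that may legally follow the element name in an opening tag.
    name_end = (b">", b" ", b"\t", b"\n", b"\r")

    records = []
    pos = start
    while True:
        begin = data.find(open_prefix, pos)
        # Stop when there is no further opening tag, or when the next
        # one starts in a later split (that split's reader owns it).
        if begin == -1 or begin >= end:
            break
        after_name = data[begin + len(open_prefix):begin + len(open_prefix) + 1]
        if after_name not in name_end:
            # Matched a longer name such as <books> or <bookmark>; skip it.
            pos = begin + 1
            continue
        finish = data.find(close_tag, begin)
        if finish == -1:
            break  # Truncated record: nothing complete left to return.
        finish += len(close_tag)
        # The record may extend beyond `end`; it is still read in full here.
        records.append(data[begin:finish])
        pos = finish
    return records
```

      With this rule, two readers given adjacent byte ranges return every record exactly once between them, wherever the boundary falls. A real implementation would also have to handle nested or self-closing elements and text encodings, which this sketch ignores.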

      YARN could serve as our cluster manager. Apache Hadoop YARN (Yet Another Resource Negotiator) would be a good cluster management tool for VXQuery: if VXQuery can read data from HDFS, it makes sense to also manage the cluster with a tool that Hadoop provides. This would replace the current custom Python scripts for cluster management.

      Goal

      • Read XML from HDFS
      • Manage cluster with YARN

            People

              sjaco002 Steven Jacobs
              prestonc Preston Carman
              Votes: 0
              Watchers: 14
