  1. Hadoop HDFS
  2. HDFS-8707

Implement an async pure C++ HDFS client


    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.2.0
    • Component/s: hdfs-client
    • Labels:
      None

      Description

      As part of the work on the C++ ORC reader in ORC-3, we need a pure C++ HDFS client that lets us do asynchronous I/O against HDFS. We want to start from the code that Haohui has been working on at https://github.com/haohui/libhdfspp.

        Issue Links

        1. Integrate the build infrastructure with hdfs-client (Sub-task, Resolved, Haohui Mai)
        2. Import third_party libraries into the repository (Sub-task, Resolved, Haohui Mai)
        3. Use std::chrono to implement the timer in the asio library (Sub-task, Resolved, Haohui Mai)
        4. Initial implementation of a Hadoop RPC v9 client (Sub-task, Resolved, Haohui Mai)
        5. Use Doxygen to generate documents for libhdfspp (Sub-task, Resolved, Haohui Mai)
        6. Implement the continuation library for libhdfspp (Sub-task, Resolved, Haohui Mai)
        7. Implement remote block reader in libhdfspp (Sub-task, Resolved, Haohui Mai)
        8. Generate Hadoop RPC stubs from protobuf definitions (Sub-task, Resolved, Haohui Mai)
        9. Implement a libhdfs(3) compatible API (Sub-task, Resolved, James Clampffer)
        10. Implement FileSystem and InputStream API for libhdfspp (Sub-task, Resolved, Haohui Mai)
        11. SASL support for data transfer protocol in libhdfspp (Sub-task, Resolved, Haohui Mai)
        12. Implement unit tests for remote block reader in libhdfspp (Sub-task, Resolved, Haohui Mai)
        13. InputStream.PositionRead() should be aware of available DNs (Sub-task, Resolved, Haohui Mai)
        14. Fix compilation issues on arch linux (Sub-task, Resolved, Owen O'Malley)
        15. Initialize protobuf fields in RemoteBlockReaderTest (Sub-task, Resolved, Haohui Mai)
        16. RPC client should fail gracefully when the connection is timed out or reset (Sub-task, Resolved, Haohui Mai)
        17. Retry reads on DN failure (Sub-task, Resolved, James Clampffer)
        18. InputStreamImpl::ReadBlockContinuation stores wrong pointers of buffers (Sub-task, Resolved, Haohui Mai)
        19. Suppress false positives from Valgrind on uninitialized variables in tests (Sub-task, Resolved, Haohui Mai)
        20. Config file reader / options classes for libhdfs++ (Sub-task, Resolved, Bob Hansen)
        21. Add logging system for libhdfs++ (Sub-task, Resolved, James Clampffer)
        22. Move the implementation to the hdfs-native-client module (Sub-task, Resolved, Haohui Mai)
        23. Refactor libhdfs into stateful/ephemeral objects (Sub-task, Resolved, Bob Hansen)
        24. Simplify embedding libhdfspp into other projects (Sub-task, Resolved, James Clampffer)
        25. Add valgrind suppression for statically initialized library objects (Sub-task, Resolved, James Clampffer)
        26. libhdfs++ should respect NN retry configuration settings (Sub-task, Resolved, Bob Hansen)
        27. InputStreamImpl should hold a shared_ptr of the BlockReader (Sub-task, Resolved, James Clampffer)
        28. Implement basic NN operations (Sub-task, Resolved, Anatoli Shein)
        29. Implement a unix-like cat utility (Sub-task, Resolved, James Clampffer)
        30. Import RapidXML 1.13 for libhdfspp (Sub-task, Resolved, Bob Hansen)
        31. libhdfspp should use sizeof(int32_t) instead of sizeof(int) when parsing data (Sub-task, Resolved, James Clampffer)
        32. Allow the location of hadoop source tree resources to be passed to CMake during a build. (Sub-task, Resolved, Bob Hansen)
        33. Create a generic function to synchronize async functions and methods. (Sub-task, Resolved, James Clampffer)
        34. libhdfspp fails to compile after HDFS-9207 (Sub-task, Resolved, Haohui Mai)
        35. Test libhdfs++ with existing libhdfs tests (Sub-task, Resolved, Stephen)
        36. Get libhdfs++ gmock tests running with CI (Sub-task, Resolved, Haohui Mai)
        37. Implement reads with implicit offset state in libhdfs++ (Sub-task, Resolved, James Clampffer)
        38. HDFS-8707 builds are failing with protobuf directories as undef (Sub-task, Resolved, Haohui Mai)
        39. Build both static and dynamic libraries for libhdfspp (Sub-task, Resolved, Stephen)
        40. Clean up the RAT warnings in the HDFS-8707 branch. (Sub-task, Resolved, Bob Hansen)
        41. Import the optional library into libhdfs++ (Sub-task, Resolved, Bob Hansen)
        42. Be able to build/test libhdfspp without Maven (Sub-task, Resolved, Unassigned)
        43. Enable valgrind for libhdfspp unit tests (Sub-task, Resolved, Bob Hansen)
        44. libhdfs++ Fix memory stomp in OpenFileForRead. (Sub-task, Resolved, James Clampffer)
        45. Fix protobuf runtime warnings (Sub-task, Resolved, Unassigned)
        46. libhdfs++: suppress warnings from third-party libraries (Sub-task, Resolved, Bob Hansen)
        47. Documentation needs to be exposed (Sub-task, Resolved, Unassigned)
        48. No header files in mvn package (Sub-task, Resolved, Unassigned)
        49. libhdfs++ Fix valgrind failures when using more than 1 io_service worker thread. (Sub-task, Resolved, James Clampffer)
        50. libhdfs++ Enable builds with no compiler optimizations (Sub-task, Resolved, Bob Hansen)
        51. Enable CI infrastructure to use libhdfs++ hdfsRead (Sub-task, Resolved, Stephen)
        52. libhdfs++: move lib/proto/cpp_helpers to third-party since it won't have an ASF license (Sub-task, Resolved, Bob Hansen)
        53. libhdfs++ Initialize BadNodeTracker in FileSystemImpl constructor (Sub-task, Resolved, James Clampffer)
        54. libhdfs++: failure to connect to ipv6 host causes CI unit tests to fail (Sub-task, Resolved, Bob Hansen)
        55. libhdfs++ deadlocks in Filesystem::New if NN connection fails (Sub-task, Resolved, Bob Hansen)
        56. libhdfs++: implement HDFSConfiguration class (Sub-task, Resolved, Bob Hansen)
        57. libhdfs++: load configuration from files (Sub-task, Resolved, Bob Hansen)
        58. libhdfs++: pull Options from default configs by default (Sub-task, Resolved, Bob Hansen)
        59. libhdfs++: Allow seek to EOF (Sub-task, Resolved, Bob Hansen)
        60. libhdfs++ Add runtime hooks to allow a client application to add low level monitoring and tests. (Sub-task, Resolved, Bob Hansen)
        61. libhdfs++: Add a mechanism to retrieve human readable error messages through the C API (Sub-task, Resolved, James Clampffer)
        62. libhdfs++: Implement builder apis from C bindings (Sub-task, Resolved, Bob Hansen)
        63. libhdfs++: Add additional type-safe getters to the Configuration class (Sub-task, Resolved, Unassigned)
        64. libhdfs++: for consistency, include files should be in hdfspp (Sub-task, Resolved, Bob Hansen)
        65. libhdfs++: Support async cancellation of read operations (Sub-task, Resolved, James Clampffer)
        66. libhdfs++: Fix inconsistencies with libhdfs C API (Sub-task, Resolved, James Clampffer)
        67. libhdfs++: potential segfault after teardown (Sub-task, Resolved, Bob Hansen)
        68. libhdfs++: Add appropriate catch blocks for ASIO operations that throw (Sub-task, Resolved, Bob Hansen)
        69. libhdfs++: Reimplement Status object as a normal struct (Sub-task, Resolved, James Clampffer)
        70. libhdfs++: Create examples of consuming libhdfs++ (Sub-task, Resolved, Unassigned)
        71. libhdfs++: Implement simple authentication (Sub-task, Resolved, Bob Hansen)
        72. libhdfs++: GetLastError not returning meaningful messages after some failures (Sub-task, Resolved, Bob Hansen)
        73. libhdfs++: RPC engine will attempt to close an asio socket before it's been opened (Sub-task, Resolved, James Clampffer)
        74. libhdfs++: Client fails to pass TokenProto from LocatedBlockProto to server when reading a block (Sub-task, Resolved, James Clampffer)
        75. libhdfs++: ConfigurationLoader throws parse_exception on invalid input (Sub-task, Resolved, Bob Hansen)
        76. libhdfs++: EACCES not setting errno correctly (Sub-task, Resolved, Bob Hansen)
        77. libhdfs++: Cancel outstanding operations without calling shutdown (Sub-task, Resolved, Bob Hansen)
        78. libhdfs++: C++ exceptions should never escape the C API (Sub-task, Resolved, Unassigned)
        79. libhdfs++: Add test suite to simulate network issues (Sub-task, Resolved, Xiaowei Zhu)
        80. libhdfs++: add hooks to facilitate fault testing (Sub-task, Resolved, Bob Hansen)
        81. libhdfs++: find a URI parsing library (Sub-task, Resolved, Bob Hansen)
        82. libhdfs++: Implement debug allocators (Sub-task, Resolved, Xiaowei Zhu)
        83. libhdfs++: Integrate logging with the C API (Sub-task, Resolved, Unassigned)
        84. libhdfs++: Shutdown sockets to avoid "Connection reset by peer" (Sub-task, Resolved, James Clampffer)
        85. libhdfs++: Fix race conditions in RPC layer (Sub-task, Resolved, Bob Hansen)
        86. libhdfs++: Datanode protocol version mismatch (Sub-task, Resolved, James Clampffer)
        87. libhdfs++: File length doesn't always count the last block if it's being written to (Sub-task, Resolved, Xiaowei Zhu)
        88. libhdfs++: hdfsConnect hangs when given bad host or port (Sub-task, Resolved, James Clampffer)
        89. libhdfs++: DatanodeConnection::Cancel should not delete the underlying socket (Sub-task, Resolved, James Clampffer)
        90. hdfs-native-client fails to build with CMake 2.8.11 or earlier (Sub-task, Resolved, Tibor Kiss)
        91. libhdfs++: Add SASL authentication (Sub-task, Resolved, Bob Hansen)
        92. libhdfs++: Get rid of lock in RpcConnectionImpl destructor (Sub-task, Resolved, James Clampffer)
        93. libhdfs++: HA namenode support (Sub-task, Resolved, James Clampffer)
        94. libhdfs++: Implement Cyrus SASL implementation in sasl_engine.cc (Sub-task, Resolved, Bob Hansen)
        95. libhdfspp: Move NameNodeOp to a separate file (Sub-task, Resolved, Anatoli Shein)
        96. libhdfs++: Implement GetPathInfo and ListDirectory (Sub-task, Resolved, Bob Hansen)
        97. libhdfs++: Implement GetBlockLocations (Sub-task, Resolved, Bob Hansen)
        98. libhdfs++: Implement GetFsStats (Sub-task, Resolved, Anatoli Shein)
        99. libhdfs++: Implement snapshot operations and GetFsStats (Sub-task, Resolved, Anatoli Shein)
        100. libhdfs++: if HA is available, authentication doesn't get parsed in configs (Sub-task, Resolved, Bob Hansen)
        101. libhdfs++: make error returning mechanism consistent across all hdfs operations (Sub-task, Resolved, Anatoli Shein)
        102. libhdfs++: Implement mkdirs, rmdir, rename, and remove (Sub-task, Resolved, Anatoli Shein)
        103. libhdfs++: Implement chmod and chown (Sub-task, Resolved, Anatoli Shein)
        104. libhdfs++: Add connect timeouts to async_connect calls (Sub-task, Resolved, Bob Hansen)
        105. libhdfs++: hdfsGetBlockLocations doesn't null terminate ip address strings (Sub-task, Resolved, James Clampffer)
        106. libhdfs++: Silence compile warnings from URI parser (Sub-task, Resolved, James Clampffer)
        107. libhdfs++: Client Name Protobuf Error (Sub-task, Resolved, Bob Hansen)
        108. libhdfs++: reorder directories in src/main/libhdfspp/examples, and add C++ version of cat tool (Sub-task, Resolved, Anatoli Shein)
        109. libhdfs++: Implement parallel find with wildcards tool (Sub-task, Resolved, Anatoli Shein)
        110. libhdfs++: return explicit error when non-secured client connects to secured server (Sub-task, Resolved, Kai Jiang)
        111. libhdfs++: FileSystem should have a convenience no-args ctor (Sub-task, Resolved, James Clampffer)
        112. libhdfs++: Expose an InputStream interface for the apache ORC project (Sub-task, Resolved, James Clampffer)
        113. libhdfs++: In RPC engine replace vector with deque for pending requests (Sub-task, Resolved, Anatoli Shein)
        114. libhdfs++: Implement recursive directory generator (Sub-task, Resolved, Anatoli Shein)
        115. libhdfs++: synchronize access to working_directory and bytes_read_. (Sub-task, Resolved, Anatoli Shein)
        116. libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, hdfs_chown, hdfs_chmod and hdfs_find. (Sub-task, Resolved, Anatoli Shein)
        117. libhdfs++: Fix broken logic in HA retry policy (Sub-task, Resolved, James Clampffer)
        118. libhdfs++: Implement the rest of the tools (Sub-task, Resolved, Anatoli Shein)
        119. libhdfs++: Public API should expose configuration parser (Sub-task, Resolved, James Clampffer)
        120. libhdfs++: rationalize ioservice interactions (Sub-task, Resolved, James Clampffer)
        121. libhdfs++: Public API headers should not depend on internal implementation (Sub-task, Resolved, James Clampffer)
        122. libhdfs++: Make log levels consistent (Sub-task, Resolved, James Clampffer)
        123. libhdfs++: Fix object lifecycle issues in the BlockReader (Sub-task, Resolved, James Clampffer)
        124. libhdfs++: Make connection to HA clusters faster (Sub-task, Resolved, James Clampffer)
        125. libhdfs++: Don't retry if there is an authentication failure (Sub-task, Resolved, James Clampffer)
        126. libhdfs++: FileSystem needs to be able to cancel pending connections (Sub-task, Resolved, James Clampffer)
        127. Expose rack id in hdfsDNInfo (Sub-task, Resolved, Xiaowei Zhu)
        128. libhdfs++: Some refactoring to better organize files (Sub-task, Resolved, James Clampffer)
        129. libhdfs++: Segfault in HA failover if DNS lookup for both Namenodes fails (Sub-task, Resolved, James Clampffer)
        130. libhdfs++: Log Datanode information when reading an HDFS block (Sub-task, Resolved, Xiaowei Zhu)
        131. libhdfs++: Fix race condition in ScopedResolver (Sub-task, Resolved, James Clampffer)
        132. libhdfs++: Log Datanode read size when reading an HDFS block (Sub-task, Resolved, Xiaowei Zhu)
        133. libhdfs++: Add a build option to skip building examples, tests, and tools (Sub-task, Resolved, Anatoli Shein)
        134. libhdfs++: RPC connection should handle authorization error call id (Sub-task, Resolved, James Clampffer)
        135. libhdfs++: Catch exceptions thrown by runtime hooks (Sub-task, Resolved, James Clampffer)
        136. libhdfs++: SASL events should be scoped closer to usage (Sub-task, Resolved, James Clampffer)
        137. libhdfs++: Get minidfscluster tests running under valgrind (Sub-task, Resolved, Anatoli Shein)
        138. libhdfs++: Authentication failure when first NN of kerberized HA cluster is standby (Sub-task, Resolved, James Clampffer)
        139. libhdfs++: Docker script fails while trying to download JDK 7. (Sub-task, Resolved, Anatoli Shein)
        140. libhdfs++: A few portability issues (Sub-task, Resolved, Anatoli Shein)
        141. libhdfs++: read with offset at EOF should return 0 bytes instead of error (Sub-task, Resolved, Xiaowei Zhu)
        142. libhdfs++: Fix compilation errors and warnings when compiling with Clang (Sub-task, Resolved, Anatoli Shein)
        143. libhdfs++: add Clang build and tests to the CI system (Sub-task, Resolved, Anatoli Shein)
        144. libhdfs++: Provide workaround to support cancel on filesystem connect until HDFS-11437 is resolved (Sub-task, Resolved, James Clampffer)
        145. libhdfs++: Make sure all steps in SaslProtocol end up calling AuthComplete (Sub-task, Resolved, James Clampffer)
        146. libhdfs++: Rebase 8707 branch onto an up to date version of trunk (Sub-task, Resolved, Deepak Majeti)
        147. libhdfs++: Add a synchronization interface for the GSSAPI (Sub-task, Resolved, James Clampffer)
        148. libhdfs++: PROTOC_IS_COMPATIBLE check fails if protobuf library is built from source (Sub-task, Resolved, Anatoli Shein)
        149. libhdfs++: Prevent Requests from holding dangling pointer to RpcEngine (Sub-task, Resolved, James Clampffer)


            People

            • Assignee: James Clampffer
            • Reporter: Owen O'Malley
