  1. Hadoop Common
  2. HADOOP-16770

Compare two directories in an HDFS filesystem every 5 minutes on the same cluster (similar to the diff command in Linux)

Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 2.10.0
    • Fix Version/s: None
    • Component/s: hdfs-client
    • Labels: None

    Description

      Hi team,

      I have created two Hadoop clusters. One cluster stores files in new, time-based directories in the Hadoop filesystem, for example /a/b/time/a.txt, b.txt, and so on.

      Every 5 minutes, I need to compare two directories on cluster 1's filesystem to check whether any new directories, with their files, have been added. If dir 1 has been updated, only the new files should be moved to dir 2. Later, the files in those new directories are copied to the HDFS filesystem of cluster 2.
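
      The comparison step described above could be sketched roughly as follows. This is only an illustration, not an existing HDFS feature: it assumes the two listings were captured as text from `hdfs dfs -ls -R <dir>`, and the function names `parse_ls_recursive` and `files_to_copy` are hypothetical. Files are compared by relative path and size only, so a file whose content changed without changing size would not be detected.

```python
from typing import Dict, List


def parse_ls_recursive(ls_output: str, root: str) -> Dict[str, int]:
    """Map each file path (relative to root) to its size in bytes.

    Expects text in the format printed by `hdfs dfs -ls -R <root>`.
    """
    prefix = root.rstrip("/") + "/"
    files: Dict[str, int] = {}
    for line in ls_output.splitlines():
        # Fields: permissions, replication, owner, group, size, date, time, path
        parts = line.split(None, 7)
        if len(parts) < 8 or parts[0].startswith("d"):
            continue  # skip blank lines, summary headers, and directory entries
        path = parts[7]
        if path.startswith(prefix):
            files[path[len(prefix):]] = int(parts[4])
    return files


def files_to_copy(source: Dict[str, int], target: Dict[str, int]) -> List[str]:
    """Return relative paths that are missing from target or differ in size."""
    return sorted(p for p, size in source.items() if target.get(p) != size)
```

      Calling `files_to_copy(parse_ls_recursive(listing1, "/dir1"), parse_ls_recursive(listing2, "/dir2"))` would then give the relative paths that still need to be copied from dir 1 to dir 2 (the paths `/dir1` and `/dir2` are placeholders).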

      Currently, HDFS does not support an hdfs dfs -diff command. Is there any solution for this?

      I have tried the -copyFromLocal and -copyToLocal commands, but they use a lot of disk space when copying from local to HDFS and from HDFS to local.
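
      One possible way to avoid staging data on local disk is Hadoop's DistCp tool, which copies directly between HDFS paths (on the same cluster or across clusters). Its `-update` option copies only files that are missing from the target or that differ from it. The sketch below merely builds and runs that command line from Python. The helper names `build_distcp_update` and `run_sync` are hypothetical, and any NameNode URIs passed in (for example `hdfs://nn1:8020/a/b`) are placeholders, not values from this issue.

```python
import subprocess
from typing import List


def build_distcp_update(source_uri: str, target_uri: str) -> List[str]:
    """Build a `hadoop distcp -update` command line as an argument list."""
    return ["hadoop", "distcp", "-update", source_uri, target_uri]


def run_sync(source_uri: str, target_uri: str) -> int:
    """Run the copy and return its exit code.

    Assumes a configured Hadoop client is available on PATH.
    """
    return subprocess.run(build_distcp_update(source_uri, target_uri)).returncode
```

      A scheduler such as cron could invoke `run_sync` every 5 minutes. Whether this fits depends on the cluster setup, since DistCp runs as a MapReduce job.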

          People

            Assignee: Unassigned
            Reporter: Sayee7 GanGSTR
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: