  1. Kudu
  2. KUDU-2071

Disk usage is much larger than the actual data size

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.3.0
    • Fix Version/s: n/a
    • Component/s: tserver
    • Labels:
    • Environment:
    • Target Version/s:
    • Flags:
      Important

      Description

      I ran rm -rf on all the data dirs before reinstalling the cluster, then inserted 1000000 records into the cluster using YCSB. The data size is about 5GB, but it consumes about 260GB of disk. Disk usage on one of the nodes is as follows:
      Before writing data:
      [root@ip-10-1-42-124 ~]# du -sh /data1/server/kudu/tserver_wal/wals/ /data2/server/kudu/tserver_data/ /data3/server/kudu/tserver_data/data/ /data4/server/kudu/tserver_data/data/
      4.0K /data1/server/kudu/tserver_wal/wals/
      24K /data2/server/kudu/tserver_data/
      8.0K /data3/server/kudu/tserver_data/data/
      8.0K /data4/server/kudu/tserver_data/data/

      After writing data:
      [root@ip-10-1-42-124 ~]# du -sh /data1/server/kudu/tserver_wal/wals/ /data2/server/kudu/tserver_data/ /data3/server/kudu/tserver_data/data/ /data4/server/kudu/tserver_data/data/
      2.7G /data1/server/kudu/tserver_wal/wals/
      29G /data2/server/kudu/tserver_data/
      29G /data3/server/kudu/tserver_data/data/
      27G /data4/server/kudu/tserver_data/data/

      Actual data size:
      9b137115cfaa427a9106c87086f41957 5041*3 MBytes
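
As a rough cross-check of the figures above, the du output for this node can be summed and compared with the reported data size. This is only a sketch: the helper names are hypothetical, and it assumes that du -sh reports binary units and that "MBytes" means MiB summed over the 3 replicas.

```python
# Sizes copied from the "after write data" du -sh output above.
DU_AFTER = {
    "/data1/server/kudu/tserver_wal/wals/": "2.7G",
    "/data2/server/kudu/tserver_data/": "29G",
    "/data3/server/kudu/tserver_data/data/": "29G",
    "/data4/server/kudu/tserver_data/data/": "27G",
}

# Assumption: du -sh uses binary units (K = 1024 bytes, and so on).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_du_size(text: str) -> float:
    """Convert a du -sh size such as '2.7G' into bytes (hypothetical helper)."""
    text = text.strip()
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)  # no suffix: treat as a plain byte count


def amplification(du_sizes: dict, logical_bytes: float) -> float:
    """Ratio of on-disk bytes on this node to the reported logical bytes."""
    on_disk = sum(parse_du_size(size) for size in du_sizes.values())
    return on_disk / logical_bytes


# Reported size: 5041*3 MBytes (assumed to be MiB across 3 replicas).
LOGICAL_BYTES = 5041 * 3 * 1024**2

node_total_gib = sum(parse_du_size(s) for s in DU_AFTER.values()) / 1024**3
ratio = amplification(DU_AFTER, LOGICAL_BYTES)
print(f"on-disk on this node: {node_total_gib:.1f} GiB")
print(f"ratio to reported data size: {ratio:.1f}x")
```

Under these assumptions, the space used on this single node already exceeds the reported size of all three replicas combined several times over.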

      Kudu tserver configuration:
      --fs_wal_dir=/var/lib/kudu/tserver
      --fs_data_dirs=/var/lib/kudu/tserver
      --default_num_replicas=3
      --tserver_master_addrs=192.168.1.22:7051,192.168.1.23:7051,192.168.1.24:7051,192.168.1.25:7051,192.168.1.26:7051
      --maintenance_manager_num_threads=4
      --block_cache_capacity_mb=10240
      --memory_limit_hard_bytes=60000000000
      --fs_wal_dir=/data1/server/kudu/tserver_wal
      --fs_data_dirs=/data2/server/kudu/tserver_data,/data3/server/kudu/tserver_data,/data4/server/kudu/tserver_data
      --fs_data_dirs_reserved_bytes=10000000000
      --log_segment_size_mb=8
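
The flag list above sets --fs_wal_dir and --fs_data_dirs twice with different values. A small check like the following can surface such repeats before they cause confusion. It is a minimal sketch: find_repeated_flags is a hypothetical helper, not part of Kudu, and only a subset of the flag lines is copied in. It does not say which occurrence the tserver honors, although the du paths above suggest the /data1 to /data4 values are the ones in effect.

```python
from collections import defaultdict

# Subset of the flag lines from the tserver configuration above.
FLAG_LINES = [
    "--fs_wal_dir=/var/lib/kudu/tserver",
    "--fs_data_dirs=/var/lib/kudu/tserver",
    "--default_num_replicas=3",
    "--fs_wal_dir=/data1/server/kudu/tserver_wal",
    "--fs_data_dirs=/data2/server/kudu/tserver_data,"
    "/data3/server/kudu/tserver_data,/data4/server/kudu/tserver_data",
    "--log_segment_size_mb=8",
]


def find_repeated_flags(lines):
    """Return {flag_name: [values in file order]} for flags set more than once."""
    seen = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip anything that is not of the form --name=value.
        if not line.startswith("--") or "=" not in line:
            continue
        name, value = line[2:].split("=", 1)
        seen[name].append(value)
    return {name: values for name, values in seen.items() if len(values) > 1}


repeated = find_repeated_flags(FLAG_LINES)
for name, values in repeated.items():
    print(f"--{name} is set {len(values)} times:")
    for value in values:
        print(f"    {value}")
```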

      Our production environment holds 25TB of data but consumes 45TB of disk. Where is the disk space going?

        Activity

        jdcryans Jean-Daniel Cryans added a comment -

        Hey KingLee, this looks like a classic case of KUDU-1943.

        jdcryans Jean-Daniel Cryans added a comment -

        I moved this jira from "In Review", a status that we only use when patches are actually up for review, to "Duplicate", since this is really KUDU-1943.


          People

          • Assignee:
            Unassigned
            Reporter:
            King Lee KingLee
          • Votes:
            0
            Watchers:
            2

              Time Tracking

              Estimated:
              168h
              Remaining:
              168h
              Logged:
              Not Specified