  1. Cassandra
  2. CASSANDRA-14470

Repair validation failed/unable to create merkle tree

Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Duplicate
    • Normal

    Description

      A full repair across all nodes and keyspaces kept failing, so I switched to repairing table by table. This table will not repair, even after a scrub and a restart of all nodes. I am using this command:

      nodetool repair -full -seq keyspace table
      
      [2018-05-25 19:26:36,525] Repair session 0198ee50-6050-11e8-a3b7-9d0793eab507 for range [(165598500763544933,166800441975877433], (-5455068259072262254,-5445777107512274819], (-4614366950466274594,-4609359222424798148], (3417371506258365094,3421921915575816226], (5221788898381458942,5222846663270250559], (3421921915575816226,3429175540277204991], (3276484330153091115,3282213186258578546], (-3306169730424140596,-3303439264231406101], (5228704360821395206,5242415853745535023], (5808045095951939338,5808562658315740708], (-3303439264231406101,-3302592736123212969]] finished (progress: 1%)
      [2018-05-25 19:27:23,848] Repair session 0180f980-6050-11e8-a3b7-9d0793eab507 for range [(-8495158945319933291,-8482949618583319581], (1803296697741516342,1805330812863783941], (8633191319643427141,8637771071728131257], (2214097236323810344,2218253238829661319], (8637771071728131257,8639627594735133685], (2195525904029414718,2214097236323810344], (-8500127431270773970,-8495158945319933291], (7151693083782264341,7152162989417914407], (-8482949618583319581,-8481973749935314249]] finished (progress: 1%)
      [2018-05-25 19:30:32,590] Repair session 01ac9d62-6050-11e8-a3b7-9d0793eab507 for range [(7887346492105510731,7893062759268864220], (-153277717939330979,-151986584968539220], (-6351665356961460262,-6336288442758847669], (7881942012672602731,7887346492105510731], (-5884528383037906783,-5878097817437987368], (6054625594262089428,6060773114960761336], (-6354401100436622515,-6351665356961460262], (3358411934943460772,3363367777663817876], (6255644242745576360,6278718135193665575], (-6321106762570843270,-6316788220143151823], (1754319239259058661,1759314644652031521], (7893062759268864220,7894890594190784729], (-8012293411840276426,-8011781808288431224]] failed with error [repair #01ac9d62-6050-11e8-a3b7-9d0793eab507 on keyspace/table, [(7887346492105510731,7893062759268864220], (-153277717939330979,-151986584968539220], (-6351665356961460262,-6336288442758847669], (7881942012672602731,7887346492105510731],
      (-5884528383037906783,-5878097817437987368], (6054625594262089428,6060773114960761336], (-6354401100436622515,-6351665356961460262], (3358411934943460772,3363367777663817876], (6255644242745576360,6278718135193665575], (-6321106762570843270,-6316788220143151823], (1754319239259058661,1759314644652031521], (7893062759268864220,7894890594190784729], (-8012293411840276426,-8011781808288431224]]] Validation failed in /192.168.8.64 (progress: 1%)
      [2018-05-25 19:30:38,744] Repair session 01ab16c1-6050-11e8-a3b7-9d0793eab507 for range [(4474598255414218354,4477186372547790770], (-8368931070988054567,-8367389908801757978], (4445104759712094068,4445123832517144036], (6749641233379918040,6749879473217708908], (717627050679001698,729408043324000761], (8984622403893999385,8990662643404904110], (4457612694557846994,4474598255414218354], (5589049422573545528,5593079877787783784], (3609693317839644945,3613727999875360405], (8499016262183246473,8504603366117127178], (-5421277973540712245,-5417725796037372830], (5586405751301680690,5589049422573545528], (-2611069890590917549,-2603911539353128123], (2424772330724108233,2427564448454334730], (3172651438220766183,3175226710613527829], (4445123832517144036,4457612694557846994], (-6827531712183440570,-6800863837312326365], (5593079877787783784,5596020904874304252], (716705770783505310,717627050679001698], (115377252345874298,119626359210683992], (239394377432130766,240250561347730054]] failed with error [repair #01ab16c1-6050-11e8-a3b7-9d0793eab507 on keyspace/table, [(4474598255414218354,4477186372547790770], (-8368931070988054567,-8367389908801757978], (4445104759712094068,4445123832517144036], (6749641233379918040,6749879473217708908], (717627050679001698,729408043324000761], (8984622403893999385,8990662643404904110], (4457612694557846994,4474598255414218354], (5589049422573545528,5593079877787783784], (3609693317839644945,3613727999875360405], (8499016262183246473,8504603366117127178], (-5421277973540712245,-5417725796037372830], (5586405751301680690,5589049422573545528], (-2611069890590917549,-2603911539353128123], (2424772330724108233,2427564448454334730], (3172651438220766183,3175226710613527829], (4445123832517144036,4457612694557846994], (-6827531712183440570,-6800863837312326365], (5593079877787783784,5596020904874304252], (716705770783505310,717627050679001698], (115377252345874298,119626359210683992], (239394377432130766,240250561347730054]]] Validation 
      failed in /192.168.8.63 (progress: 1%)
      [2018-05-25 19:31:49,787] Repair session 01a4ae20-6050-11e8-a3b7-9d0793eab507 for range [(-2541759376733803975,-2534654569942446346], (5879245607426320709,5880012885546321040], (-6369551868880447648,-6359409984081717656], (-6599114937188060013,-6597469275333616279], (-5074096572632539578,-5067488659471711472], (-6379754598016153113,-6369551868880447648], (2064405355459946002,2071996664850745669], (-2534654569942446346,-2517719430302560572], (7881309182913674059,7881942012672602731], (-2544088936726049385,-2541759376733803975], (2279496339605311864,2281121064700207175], (7872992433920056063,7881309182913674059], (2062114659748646544,2064405355459946002], (-2150878401005443227,-2148033787477253835], (-1741268532521628862,-1723492194304925672], (-2148033787477253835,-2148008030576152684], (2274175180327961853,2279496339605311864]] failed with error [repair #01a4ae20-6050-11e8-a3b7-9d0793eab507 on keyspace/table, [(-2541759376733803975,-2534654569942446346], (5879245607426320709,5880012885546321040], (-6369551868880447648,-6359409984081717656], (-6599114937188060013,-6597469275333616279], (-5074096572632539578,-5067488659471711472], (-6379754598016153113,-6369551868880447648], (2064405355459946002,2071996664850745669], (-2534654569942446346,-2517719430302560572], (7881309182913674059,7881942012672602731], (-2544088936726049385,-2541759376733803975], (2279496339605311864,2281121064700207175], (7872992433920056063,7881309182913674059], (2062114659748646544,2064405355459946002], (-2150878401005443227,-2148033787477253835], (-1741268532521628862,-1723492194304925672], (-2148033787477253835,-2148008030576152684], (2274175180327961853,2279496339605311864]]] Validation failed in /192.168.8.64 (progress: 1%)
      [2018-05-25 19:31:49,845] Repair session 01c26f52-6050-11e8-a3b7-9d0793eab507 for range [(-6336288442758847669,-6327494039552357362], (-6596499651894591521,-6570651311582753946], (-6597469275333616279,-6596499651894591521], (2057770067222008303,2062114659748646544], (-5870054111151365631,-5835304364517776345], (-3812151910311844467,-3802006636037441627], (-2619800330042834297,-2615481117037091603], (4808940926778034213,4810350864294758856], (-7508256920307222829,-7506372018227268626], (-7104590653728972577,-7104546570237712729], (3158009800098518496,3172651438220766183], (-2615481117037091603,-2611069890590917549], (-5878097817437987368,-5870054111151365631], (-2547658065527858190,-2544088936726049385], (232652608016417486,239394377432130766], (3154311195118940026,3158009800098518496]] failed with error [repair #01c26f52-6050-11e8-a3b7-9d0793eab507 on keyspace/table, [(-6336288442758847669,-6327494039552357362], (-6596499651894591521,-6570651311582753946], (-6597469275333616279,-6596499651894591521], (2057770067222008303,2062114659748646544], (-5870054111151365631,-5835304364517776345], (-3812151910311844467,-3802006636037441627], (-2619800330042834297,-2615481117037091603], (4808940926778034213,4810350864294758856], (-7508256920307222829,-7506372018227268626], (-7104590653728972577,-7104546570237712729], (3158009800098518496,3172651438220766183], (-2615481117037091603,-2611069890590917549], (-5878097817437987368,-5870054111151365631], (-2547658065527858190,-2544088936726049385], (232652608016417486,239394377432130766], (3154311195118940026,3158009800098518496]]] Validation failed in /192.168.10.63 (progress: 1%)
      [2018-05-25 19:31:50,027] Repair session 01b3f061-6050-11e8-a3b7-9d0793eab507 for range [(2424051311739332070,2424772330724108233], (6848066208555197,10521229928033262], (992385332284940308,1000066900542109637], (4418797036920007266,4421783585221695744], (-5417725796037372830,-5412149532100548404], (178766242164281045,191217736969025363], (-3802006636037441627,-3796416071827586080], (5683533739750457455,5688298632819249302], (3653327414143088744,3655860906328373441], (3655860906328373441,3657219071532471378], (5746716543928841040,5753897313199191356], (-7506372018227268626,-7477180353912675682], (1911795960615895165,1921474545637686707], (4421783585221695744,4445104759712094068], (-4428987737460108139,-4413904067417968038], (5680321325075541449,5683533739750457455]] failed with error [repair #01b3f061-6050-11e8-a3b7-9d0793eab507 on keyspace/table, [(2424051311739332070,2424772330724108233], (6848066208555197,10521229928033262], (992385332284940308,1000066900542109637], (4418797036920007266,4421783585221695744], (-5417725796037372830,-5412149532100548404], (178766242164281045,191217736969025363], (-3802006636037441627,-3796416071827586080], (5683533739750457455,5688298632819249302], (3653327414143088744,3655860906328373441], (3655860906328373441,3657219071532471378], (5746716543928841040,5753897313199191356], (-7506372018227268626,-7477180353912675682], (1911795960615895165,1921474545637686707], (4421783585221695744,4445104759712094068], (-4428987737460108139,-4413904067417968038], (5680321325075541449,5683533739750457455]]] Validation failed in /192.168.10.63 (progress: 1%)
      [2018-05-25 19:31:50,065] Repair session 01d226c2-6050-11e8-a3b7-9d0793eab507 for range [(731483217573828589,749016052425471844], (3349217091766639630,3355743728768043539], (8297509817744988677,8299811671851037140], (-1080064213437365415,-1067683134584617984], (-8988387420898594746,-8988256206650322851], (-1083473978088553649,-1080064213437365415], (-7068314886788869981,-7062826172876507507], (8299811671851037140,8306379796303668520], (-8500393685425499630,-8500127431270773970], (9077374236600850244,9080101637323836166], (9080101637323836166,9095536755598180114], (-2759657072078827823,-2750629632199441038], (-7938459356954944009,-7933123149264580832], (1759642905348136701,1772996641768793656], (-2788441126655538224,-2774970527117004032], (-7070810217579746608,-7068314886788869981], (-7959560447639828128,-7938459356954944009], (-7679921498492428955,-7664015662435807775]] failed with error [repair #01d226c2-6050-11e8-a3b7-9d0793eab507 on keyspace/table, [(731483217573828589,749016052425471844], (3349217091766639630,3355743728768043539], (8297509817744988677,8299811671851037140], (-1080064213437365415,-1067683134584617984], (-8988387420898594746,-8988256206650322851], (-1083473978088553649,-1080064213437365415], (-7068314886788869981,-7062826172876507507], (8299811671851037140,8306379796303668520], (-8500393685425499630,-8500127431270773970], (9077374236600850244,9080101637323836166], (9080101637323836166,9095536755598180114], (-2759657072078827823,-2750629632199441038], (-7938459356954944009,-7933123149264580832], (1759642905348136701,1772996641768793656], (-2788441126655538224,-2774970527117004032], (-7070810217579746608,-7068314886788869981], (-7959560447639828128,-7938459356954944009], (-7679921498492428955,-7664015662435807775]]] Validation failed in /192.168.8.63 (progress: 2%)
      [2018-05-25 19:32:24,797] Repair session 01aff8c0-6050-11e8-a3b7-9d0793eab507 for range [(119626359210683992,128454334208965433], (6169854579148936152,6189260921105966960], (8460580156771389602,8466680988634247357], (10521229928033262,11278848941988721], (6165215300562655515,6169854579148936152], (191217736969025363,212964375650430729], (-5297146550802223153,-5294434130239676253], (6189260921105966960,6193074220809370652], (-655425716305023073,-647730635946823030]] failed with error [repair #01aff8c0-6050-11e8-a3b7-9d0793eab507 on keyspace/table, [(119626359210683992,128454334208965433], (6169854579148936152,6189260921105966960], (8460580156771389602,8466680988634247357], (10521229928033262,11278848941988721], (6165215300562655515,6169854579148936152], (191217736969025363,212964375650430729], (-5297146550802223153,-5294434130239676253], (6189260921105966960,6193074220809370652], (-655425716305023073,-647730635946823030]]] Validation failed in /192.168.10.63 (progress: 2%)
      [2018-05-25 19:32:24,873] Repair session 0199d8b1-6050-11e8-a3b7-9d0793eab507 for range [(2708724319719658573,2710986923384204956], (6278718135193665575,6281813004301666161], (-8025315476660819134,-8015410683496661099], (2516704840921371424,2519633614752918103], (2519633614752918103,2526922953145276348], (8641102301927501454,8641256970223193109], (8643632109719583963,8645181823655307237], (-8015410683496661099,-8012293411840276426], (1368548173174048881,1373330457443776421], (5550121777767121,6848066208555197], (8641256970223193109,8643632109719583963], (-4201893423037098789,-4196287665648271477], (2692054381245703566,2708724319719658573], (-4208139091663389178,-4201893423037098789], (6281813004301666161,6282606461503930756], (-3470325001213070915,-3465759276556337455], (-4196287665648271477,-4185162268982289501], (-5006305410789315624,-5000646423000423501], (2714363942918413158,2722577239100121227], (5692402142504566885,5693342630493279303], (2710986923384204956,2714363942918413158], (5688298632819249302,5692402142504566885]] failed with error [repair #0199d8b1-6050-11e8-a3b7-9d0793eab507 on keyspace/table, [(2708724319719658573,2710986923384204956], (6278718135193665575,6281813004301666161], (-8025315476660819134,-8015410683496661099], (2516704840921371424,2519633614752918103], (2519633614752918103,2526922953145276348], (8641102301927501454,8641256970223193109], (8643632109719583963,8645181823655307237], (-8015410683496661099,-8012293411840276426], (1368548173174048881,1373330457443776421], (5550121777767121,6848066208555197], (8641256970223193109,8643632109719583963], (-4201893423037098789,-4196287665648271477], (2692054381245703566,2708724319719658573], (-4208139091663389178,-4201893423037098789], (6281813004301666161,6282606461503930756], (-3470325001213070915,-3465759276556337455], (-4196287665648271477,-4185162268982289501], (-5006305410789315624,-5000646423000423501], (2714363942918413158,2722577239100121227], 
      (5692402142504566885,5693342630493279303], (2710986923384204956,2714363942918413158], (5688298632819249302,5692402142504566885]]] Validation failed in /192.168.8.65 (progress: 2%)
      Exception occurred during clean-up. java.lang.reflect.UndeclaredThrowableException
      Cassandra has shutdown.
      error: [2018-05-25 19:36:47,652] JMX connection closed. You should check server log for repair status of keyspace keyspace(Subsequent keyspaces are not going to be repaired).
      -- StackTrace --
      May 25, 2018 7:36:47 PM ClientCommunicatorAdmin Checker-run
      WARNING: Failed to check connection: java.net.SocketException: Connection reset
      java.io.IOException: [2018-05-25 19:36:47,652] JMX connection closed. You should check server log for repair status of keyspace keyspace(Subsequent keyspaces are not going to be repaired).
              at org.apache.cassandra.tools.RepairRunner.handleConnectionFailed(RepairRunner.java:98)
              at org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:86)
              at javax.management.NotificationBroadcasterSupport.handleNotification(NotificationBroadcasterSupport.java:275)
              at javax.management.NotificationBroadcasterSupport$SendNotifJob.run(NotificationBroadcasterSupport.java:352)
              at javax.management.NotificationBroadcasterSupport$1.execute(NotificationBroadcasterSupport.java:337)
              at javax.management.NotificationBroadcasterSupport.sendNotification(NotificationBroadcasterSupport.java:248)
              at javax.management.remote.rmi.RMIConnector.sendNotification(RMIConnector.java:441)
              at javax.management.remote.rmi.RMIConnector.access$1200(RMIConnector.java:121)
              at javax.management.remote.rmi.RMIConnector$RMIClientCommunicatorAdmin.gotIOException(RMIConnector.java:1531)
              at com.sun.jmx.remote.internal.ClientCommunicatorAdmin$Checker.run(ClientCommunicatorAdmin.java:199)
              at java.lang.Thread.run(Thread.java:748)
      
      May 25, 2018 7:36:47 PM ClientCommunicatorAdmin Checker-run
      WARNING: stopping
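
The nodetool output above is hard to scan because every session line carries its full token-range list. As a rough aid, the sketch below counts finished sessions and tallies validation failures per endpoint. The regular expressions are inferred only from the lines quoted in this report, so they are assumptions that may not hold for other Cassandra versions, and `summarize_repair_output` is a hypothetical helper name, not part of nodetool.

```python
import re
from collections import Counter

# Assumption: a failed session line contains "Validation failed in /<ip>",
# as in the output quoted in this report.
FAILED_IN_RE = re.compile(r"Validation failed in /(\S+)")


def summarize_repair_output(text):
    """Return (finished_count, Counter mapping endpoint -> failed sessions)."""
    # The console wraps long lines, so collapse all whitespace first.
    flat = " ".join(text.split())
    finished = 0
    failed_by_endpoint = Counter()
    # "Repair session " (capital R) opens each session line exactly once;
    # the text before the first occurrence is not a session and is skipped.
    for chunk in flat.split("Repair session ")[1:]:
        match = FAILED_IN_RE.search(chunk)
        if match:
            failed_by_endpoint[match.group(1)] += 1
        elif "] finished" in chunk:
            finished += 1
    return finished, failed_by_endpoint
```

Saving the console output to a file (for example a hypothetical `repair.out`) and passing its contents to `summarize_repair_output` gives a per-node failure count, which makes it easier to see that the failures here are spread over several replicas rather than confined to one.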
      

      Here is the log on one of the nodes where validation fails.

      INFO  [AntiEntropyStage:1] 2018-05-25 19:23:10,548 Validator.java:281 - [repair #01cf67a1-6050-11e8-a3b7-9d0793eab507] Sending completed merkle tree to /192.168.10.65 for pr$
      INFO  [AntiEntropyStage:1] 2018-05-25 19:26:17,161 Validator.java:281 - [repair #01828020-6050-11e8-a3b7-9d0793eab507] Sending completed merkle tree to /192.168.10.65 for pr$
      INFO  [AntiEntropyStage:1] 2018-05-25 19:26:23,909 Validator.java:281 - [repair #019dd051-6050-11e8-a3b7-9d0793eab507] Sending completed merkle tree to /192.168.10.65 for pr$
      INFO  [AntiEntropyStage:1] 2018-05-25 19:28:15,118 Validator.java:281 - [repair #01c52e71-6050-11e8-a3b7-9d0793eab507] Sending completed merkle tree to /192.168.10.65 for pr$
      INFO  [GossipTasks:1] 2018-05-25 19:30:23,087 Gossiper.java:1034 - InetAddress /192.168.10.65 is now DOWN
      INFO  [HANDSHAKE-/192.168.10.65] 2018-05-25 19:30:31,093 OutboundTcpConnection.java:560 - Handshaking version with /192.168.10.65
      INFO  [HANDSHAKE-/192.168.10.65] 2018-05-25 19:30:31,281 OutboundTcpConnection.java:560 - Handshaking version with /192.168.10.65
      INFO  [RequestResponseStage-4] 2018-05-25 19:30:31,320 Gossiper.java:1019 - InetAddress /192.168.10.65 is now UP
      INFO  [RequestResponseStage-3] 2018-05-25 19:30:31,320 Gossiper.java:1019 - InetAddress /192.168.10.65 is now UP
      INFO  [RequestResponseStage-2] 2018-05-25 19:30:31,320 Gossiper.java:1019 - InetAddress /192.168.10.65 is now UP
      INFO  [RequestResponseStage-1] 2018-05-25 19:30:31,320 Gossiper.java:1019 - InetAddress /192.168.10.65 is now UP
      INFO  [RequestResponseStage-5] 2018-05-25 19:30:31,320 Gossiper.java:1019 - InetAddress /192.168.10.65 is now UP
      INFO  [AntiEntropyStage:1] 2018-05-25 19:30:49,172 Validator.java:281 - [repair #01860291-6050-11e8-a3b7-9d0793eab507] Sending completed merkle tree to /192.168.10.65 for pr$
      INFO  [HANDSHAKE-/192.168.10.65] 2018-05-25 19:30:49,188 OutboundTcpConnection.java:560 - Handshaking version with /192.168.10.65
      INFO  [HANDSHAKE-/192.168.10.65] 2018-05-25 19:30:54,188 OutboundTcpConnection.java:569 - Cannot handshake version with /192.168.10.65
      INFO  [HANDSHAKE-/192.168.10.65] 2018-05-25 19:30:54,188 OutboundTcpConnection.java:560 - Handshaking version with /192.168.10.65
      INFO  [HANDSHAKE-/192.168.10.65] 2018-05-25 19:30:59,188 OutboundTcpConnection.java:569 - Cannot handshake version with /192.168.10.65
      INFO  [GossipTasks:1] 2018-05-25 19:31:03,247 Gossiper.java:1034 - InetAddress /192.168.10.65 is now DOWN
      INFO  [HANDSHAKE-/192.168.10.65] 2018-05-25 19:31:10,250 OutboundTcpConnection.java:560 - Handshaking version with /192.168.10.65
      INFO  [HANDSHAKE-/192.168.10.65] 2018-05-25 19:31:12,237 OutboundTcpConnection.java:560 - Handshaking version with /192.168.10.65
      INFO  [RequestResponseStage-7] 2018-05-25 19:31:12,712 Gossiper.java:1019 - InetAddress /192.168.10.65 is now UP
      INFO  [RequestResponseStage-9] 2018-05-25 19:31:12,712 Gossiper.java:1019 - InetAddress /192.168.10.65 is now UP
      INFO  [RequestResponseStage-13] 2018-05-25 19:31:12,712 Gossiper.java:1019 - InetAddress /192.168.10.65 is now UP
      INFO  [GossipTasks:1] 2018-05-25 19:31:37,252 Gossiper.java:1034 - InetAddress /192.168.10.65 is now DOWN
      INFO  [HANDSHAKE-/192.168.10.65] 2018-05-25 19:31:45,254 OutboundTcpConnection.java:560 - Handshaking version with /192.168.10.65
      INFO  [HANDSHAKE-/192.168.10.65] 2018-05-25 19:31:48,759 OutboundTcpConnection.java:560 - Handshaking version with /192.168.10.65
      ERROR [ValidationExecutor:7] 2018-05-25 19:31:49,021 Validator.java:268 - Failed creating a merkle tree for [repair #01c26f52-6050-11e8-a3b7-9d0793eab507 on keyspace/$
      ERROR [ValidationExecutor:7] 2018-05-25 19:31:49,022 CassandraDaemon.java:228 - Exception in thread Thread[ValidationExecutor:7,1,main]
      java.lang.RuntimeException: Parent repair session with id = 0103da40-6050-11e8-a3b7-9d0793eab507 has failed.
              at org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:412) ~[apache-cassandra-3.11.2.jar:3.11.2]
              at org.apache.cassandra.db.compaction.CompactionManager.getSSTablesToValidate(CompactionManager.java:1459) ~[apache-cassandra-3.11.2.jar:3.11.2]
              at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1366) ~[apache-cassandra-3.11.2.jar:3.11.2]
              at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:86) ~[apache-cassandra-3.11.2.jar:3.11.2]
              at org.apache.cassandra.db.compaction.CompactionManager$13.call(CompactionManager.java:955) ~[apache-cassandra-3.11.2.jar:3.11.2]
              at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_171]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_171]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_171]
              at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.2.jar:3.11.2]
              at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_171]
      INFO  [RequestResponseStage-2] 2018-05-25 19:31:49,025 Gossiper.java:1019 - InetAddress /192.168.10.65 is now UP
      INFO  [RequestResponseStage-3] 2018-05-25 19:31:49,025 Gossiper.java:1019 - InetAddress /192.168.10.65 is now UP
      INFO  [RequestResponseStage-1] 2018-05-25 19:31:49,039 Gossiper.java:1019 - InetAddress /192.168.10.65 is now UP
      ERROR [ValidationExecutor:7] 2018-05-25 19:31:49,817 Validator.java:268 - Failed creating a merkle tree for [repair #01b3f061-6050-11e8-a3b7-9d0793eab507 on keyspace/$
      ERROR [ValidationExecutor:7] 2018-05-25 19:31:49,817 CassandraDaemon.java:228 - Exception in thread Thread[ValidationExecutor:7,1,main]
      java.lang.RuntimeException: Parent repair session with id = 0103da40-6050-11e8-a3b7-9d0793eab507 has failed.
              at org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:412) ~[apache-cassandra-3.11.2.jar:3.11.2]
              at org.apache.cassandra.db.compaction.CompactionManager.getSSTablesToValidate(CompactionManager.java:1459) ~[apache-cassandra-3.11.2.jar:3.11.2]
              at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1366) ~[apache-cassandra-3.11.2.jar:3.11.2]
              at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:86) ~[apache-cassandra-3.11.2.jar:3.11.2]
              at org.apache.cassandra.db.compaction.CompactionManager$13.call(CompactionManager.java:955) ~[apache-cassandra-3.11.2.jar:3.11.2]
              at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_171]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_171]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_171]
              at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.2.jar:3.11.2]
              at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_171]
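
In the log above, the gossip DOWN/UP events for /192.168.10.65 are interleaved with the Merkle tree failures, and the ordering is easy to misread by eye. The sketch below is one possible way to line them up: for each "Failed creating a merkle tree" line it reports the last gossip state seen for a given peer and when that state began. The line patterns are assumptions taken only from this excerpt, and `gossip_state_at_failures` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Assumed line shape: LEVEL [thread] yyyy-MM-dd HH:mm:ss,SSS message
LINE_RE = re.compile(
    r"(?:INFO|WARN|ERROR|DEBUG)\s+\[[^\]]+\]\s+"
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(.*)"
)
GOSSIP_RE = re.compile(r"InetAddress /(\S+) is now (UP|DOWN)")


def gossip_state_at_failures(log_text, peer):
    """Return a list of (failure_time, peer_state, state_since) tuples."""
    state, since = None, None  # unknown until the first gossip event
    results = []
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue  # stack-trace frames and other continuation lines
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
        message = m.group(2)
        gossip = GOSSIP_RE.search(message)
        if gossip and gossip.group(1) == peer:
            # Repeated UP/DOWN lines keep the time of the first transition.
            if gossip.group(2) != state:
                state, since = gossip.group(2), ts
        elif "Failed creating a merkle tree" in message:
            results.append((ts, state, since))
    return results
```

Applied to the excerpt above with `peer="192.168.10.65"`, the first failure at 19:31:49,021 falls after the DOWN event at 19:31:37,252 and just before the UP event at 19:31:49,025. This only shows ordering in one node's log; it does not by itself establish cause.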
      

      192.168.10.65 is the node where I started the repair. It looks like this node is marked DOWN just before the merkle tree creation failure occurs. The debug log on the repair node is full of entries like the ones below, which don't help me much.

      DEBUG [RepairJobTask:11] 2018-05-25 19:25:49,646 MerkleTree.java:295 - (17) Hashing sub-ranges [#<TreeRange (8732300281801533308,8732300339552037321] depth=18>, #<TreeRange (8732300339552037321,8732300397302541334] depth=18>] for #<TreeRange (8732300281801533308,8732300397302541334] depth=17> divided by midpoint 8732300339552037321
      DEBUG [RepairJobTask:11] 2018-05-25 19:25:49,647 MerkleTree.java:311 - (17) Inconsistent digest on left sub-range #<TreeRange (8732300281801533308,8732300339552037321] depth=18>: [#<Leaf [16cd9a47184232c7ed028a9d4546332d8a8c8ff83526f8f274997592eecc722d]>, #<Leaf [97b8f2fd61c1130ed2cfd8c2db52afdea59b2cabb5f39aed967cf8f2539f08b8]>]
      DEBUG [RepairJobTask:11] 2018-05-25 19:25:49,647 MerkleTree.java:333 - (17) Inconsistent digest on right sub-range #<TreeRange (8732300339552037321,8732300397302541334] depth=18>: [#<Leaf [fcf6daa5b5124a1e099e7776475aff22f2befedd88dc7b3e4277b92fd3115833]>, #<Leaf []>]
      DEBUG [RepairJobTask:11] 2018-05-25 19:25:49,647 MerkleTree.java:346 - (17) Fully inconsistent range [#<TreeRange (8732300281801533308,8732300339552037321] depth=18>, #<TreeRange (8732300339552037321,8732300397302541334] depth=18>]
      DEBUG [RepairJobTask:11] 2018-05-25 19:25:49,647 MerkleTree.java:346 - (16) Fully inconsistent range [#<TreeRange (8732300166300525283,8732300281801533308] depth=17>, #<TreeRange (8732300281801533308,8732300397302541334] depth=17>]
      DEBUG [RepairJobTask:11] 2018-05-25 19:25:49,647 MerkleTree.java:346 - (15) Fully inconsistent range [#<TreeRange (8732299935298509232,8732300166300525283] depth=16>, #<TreeRange (8732300166300525283,8732300397302541334] depth=16>]
      DEBUG [RepairJobTask:11] 2018-05-25 19:25:49,647 MerkleTree.java:346 - (14) Fully inconsistent range [#<TreeRange (8732299473294477131,8732299935298509232] depth=15>, #<TreeRange (8732299935298509232,8732300397302541334] depth=15>]
      DEBUG [RepairJobTask:11] 2018-05-25 19:25:49,647 MerkleTree.java:346 - (13) Fully inconsistent range [#<TreeRange (8732298549286412929,8732299473294477131] depth=14>, #<TreeRange (8732299473294477131,8732300397302541334] depth=14>]
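
Since the debug log repeats these MerkleTree lines at high volume, a small counter can condense them. The sketch below groups the messages by the number in parentheses and by message type. In the excerpt that number matches the depth of the range being split, but treating it as the tree depth is an assumption, and `summarize_merkle_debug` is a hypothetical helper name.

```python
import re
from collections import Counter

# Message texts are copied from the debug lines quoted in this report.
MERKLE_RE = re.compile(
    r"MerkleTree\.java:\d+ - \((\d+)\) "
    r"(Hashing sub-ranges"
    r"|Inconsistent digest on (?:left|right) sub-range"
    r"|Fully inconsistent range)"
)


def summarize_merkle_debug(log_text):
    """Count MerkleTree debug messages per (level, message type)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = MERKLE_RE.search(line)
        if m:
            counts[(int(m.group(1)), m.group(2))] += 1
    return counts
```

Sorting the result by level shows how far up the tree the "Fully inconsistent range" messages reach, which is a quicker read than the raw log.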
      

      I am really at a loss as to how to repair this table at this point.

            People

              Assignee: Unassigned
              Reporter: Harry Hough (ozzieisaacs)