Cassandra - New node bootstrapped - Not compacting


I had a 21-node cluster (C* 2.2) of m4.2xlarges, each with 5 volumes of 1 TB SSDs.

When it was 50% full (each node at 500 GB * 5 = 2.5 TB), I realised I needed more space and added a new node.

This new node has joined the cluster (it went from UJ to UN), but its disk usage is at 4.2 TB.
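As a rough sanity check on those numbers, the sketch below estimates what an even share of the data would be after the new node joins. It assumes evenly balanced token ownership and that the 2.5 TB per node already includes replication; `expected_load_tb` is a name made up for this illustration.

```python
def expected_load_tb(old_nodes: int, load_per_node_tb: float, new_nodes: int) -> float:
    """Even share per node after growing the cluster (assumes balanced tokens)."""
    total_tb = old_nodes * load_per_node_tb
    return total_tb / (old_nodes + new_nodes)


expected = expected_load_tb(old_nodes=21, load_per_node_tb=2.5, new_nodes=1)
observed = 4.2  # TB on the new node, as reported in the question

print(f"expected share: {expected:.2f} TB")
print(f"observed / expected: {observed / expected:.2f}x")
```

Under these assumptions the new node would be holding well above an even share, which fits the suspicion that freshly streamed SSTables have not been compacted yet.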

I figured this was due to compactions lagging behind, so I waited a few days. Disk usage did not change, though there were compactions taking place. The new box was CPU bound, so I bumped it to a compute-optimised c4.8xlarge box, cranked concurrent_compactors up to 20, and disabled compaction throughput throttling.

In the meantime I have stopped writes to the cluster. The number of pending compactions keeps changing, but the data on disk is not going down.

What am I doing wrong? System time looks high. I am using org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy, and the current compaction thresholds are min = 4, max = 32.

When I run strace -f -c -p cassandra-pid > strace_count:

    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     49.57 7431.363672      140392     52933     17755 futex
     30.22 4530.012667      482481      9389           epoll_wait
     11.33 1697.685882     2143543       792           recvfrom
      3.68  551.306817        1596    345500         7 write
      3.58  537.257283    14138350        38        33 restart_syscall
      0.78  117.381206      111262      1055           poll
      0.28   41.738677         636     65675           lseek
      0.14   21.138626        1659     12741           pread
      0.10   15.189009        1838      8265           read
      0.07    9.898101         696     14229           sched_yield
      0.06    8.984107       23831       377           sendto
      0.04    6.148230        9759       630           munmap
      0.04    5.760339       21902       263           mprotect
      0.02    3.154839         992      3181       359 fadvise64
      0.02    3.107529         652      4769       215 stat
      0.01    2.006363      167197        12           msync
      0.01    1.956998        7040       278           mmap
      0.01    1.838682        1155      1592         8 unlink
      0.01    1.080512         602      1794           lstat
      0.01    0.861741         578      1490           close
      0.00    0.626903         562      1116           open
      0.00    0.596450         588      1014           fcntl
      0.00    0.440250         644       684           fstat
      0.00    0.318874         630       506           epoll_ctl
      0.00    0.249772        4625        54           fdatasync
      0.00    0.149440        1660        90           fsync
      0.00    0.093154         647       144           rename
      0.00    0.069017         575       120           statfs
      0.00    0.018136         356        51           getpriority
      0.00    0.014358         598        24           rt_sigprocmask
      0.00    0.011584         161        72           times
      0.00    0.009858         616        16           setsockopt
      0.00    0.009396         940        10           link
      0.00    0.008072          24       336         7 rt_sigreturn
      0.00    0.004960        1240         4           getsockopt
      0.00    0.004926         411        12           sched_getaffinity
      0.00    0.004503         500         9           dup2
      0.00    0.002998         500         6           madvise
      0.00    0.002693         449         6           set_robust_list
      0.00    0.002597         433         6           accept
      0.00    0.002000         333         6           clone
      0.00    0.002000         500         4         2 accept4
      0.00    0.001243         207         6           gettid
      0.00    0.001000         500         2           writev
      0.00    0.001000         500         2           recvmsg
      0.00    0.001000         143         7           getsockname
      0.00    0.001000         500         2           getpeername
      0.00    0.001000         167         6         6 setpriority
      0.00    0.000000           0         1           socket
      0.00    0.000000           0         1           bind
    ------ ----------- ----------- --------- --------- ----------------
    100.00 14990.519464                529320     18392 total
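To read a summary like this more easily, a small parser can rank syscalls by their reported seconds. This is only a sketch: the function names are invented for this example, and whether the seconds are CPU time or wall-clock waiting depends on the strace version and flags, so the ranking is a hint rather than a diagnosis.

```python
def parse_strace_summary(text: str) -> dict:
    """Map syscall name -> reported seconds from `strace -c` summary rows."""
    seconds = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 5 fields, or 6 when the errors column is filled in.
        if len(parts) not in (5, 6) or parts[-1] == "total":
            continue
        try:
            seconds[parts[-1]] = float(parts[1])
        except ValueError:
            continue  # separator row made of dashes
    return seconds


def share_of(seconds: dict, names) -> float:
    """Fraction of the summed seconds attributed to the given syscalls."""
    total = sum(seconds.values())
    return sum(seconds.get(name, 0.0) for name in names) / total


sample = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 49.57 7431.363672      140392     52933     17755 futex
 30.22 4530.012667      482481      9389           epoll_wait
  3.68  551.306817        1596    345500         7 write
"""
parsed = parse_strace_summary(sample)
for name, secs in sorted(parsed.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:12s} {secs:14.6f}")
```

futex and epoll_wait are usually blocking waits (parked threads, idle network polling), so their position at the top of the list does not by itself explain the high system time seen in top.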

When I run top with the per-CPU view (1):

    Tasks: 1506 total,   8 running, 1496 sleeping,   0 stopped,   2 zombie
    Cpu0  :  0.3%us, 47.3%sy, 10.5%ni, 41.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu1  :  0.7%us, 87.6%sy, 11.7%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu2  :  3.2%us, 65.0%sy,  0.0%ni, 31.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu3  : 11.6%us, 39.9%sy,  0.0%ni, 48.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu4  :  1.0%us, 55.3%sy,  9.2%ni, 34.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu5  :  0.3%us, 98.0%sy,  1.7%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu6  :  0.4%us, 90.7%sy,  1.4%ni,  6.8%id,  0.7%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu7  :  3.4%us, 20.2%sy,  9.4%ni, 64.0%id,  3.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu8  :  1.7%us, 24.9%sy,  0.3%ni, 73.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu9  :  0.7%us, 79.4%sy,  0.7%ni, 18.9%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu10 :  0.7%us, 64.9%sy, 13.6%ni, 14.0%id,  6.8%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu11 :  1.0%us, 50.7%sy,  0.0%ni, 18.6%id, 29.7%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu12 :  0.3%us, 58.9%sy,  0.0%ni, 40.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu13 :  0.3%us, 72.5%sy, 26.8%ni,  0.0%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu14 :  0.0%us, 50.2%sy, 49.8%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu15 :  0.0%us,100.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu16 :  0.3%us, 54.2%sy,  0.0%ni, 40.5%id,  5.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu17 :  0.7%us, 46.3%sy, 19.9%ni, 24.0%id,  9.1%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu18 :  0.7%us, 68.9%sy,  0.0%ni, 30.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu19 :  5.7%us,  3.4%sy,  0.0%ni, 90.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu20 :  0.7%us, 44.4%sy,  0.0%ni, 54.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu21 :  1.3%us, 67.8%sy,  0.0%ni, 30.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu22 :  0.7%us, 45.5%sy,  7.3%ni, 42.9%id,  3.6%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu23 :  1.3%us, 22.7%sy,  0.0%ni, 75.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu24 :  0.0%us, 65.4%sy,  0.0%ni, 34.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu25 :  0.0%us, 62.0%sy, 12.2%ni, 25.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu26 :  1.3%us, 68.9%sy, 12.6%ni, 17.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu27 :  0.0%us, 64.3%sy, 12.9%ni, 22.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu28 :  0.0%us, 75.8%sy,  0.0%ni, 23.5%id,  0.7%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu29 :  0.0%us, 60.3%sy,  1.7%ni, 37.4%id,  0.7%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu30 :  0.3%us, 48.3%sy, 12.7%ni, 38.0%id,  0.7%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu31 :  0.0%us,100.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu32 :  0.0%us, 72.1%sy, 25.2%ni,  0.0%id,  2.7%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu33 :  0.0%us,100.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu34 :  0.3%us, 66.7%sy,  0.0%ni, 33.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu35 :  0.0%us, 67.7%sy,  0.0%ni, 32.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Mem:  61820728k total, 61610932k used,   209796k free,      456k buffers
    Swap:        0k total,        0k used,        0k free, 35425968k cached

nodetool compactionstats

    pending tasks: 281
    id   compaction type     keyspace               table     completed         total    unit   progress
    id        compaction   keyspace_1         table_____4    1591902797    2851758523   bytes     55.82%
    id        compaction   keyspace_1         table_____1     193582898     567222689   bytes     34.13%
    id        compaction   keyspace_1         table_____2     187022078    2264168754   bytes      8.26%
    id        compaction   keyspace_1         table_____1   22841754587   24781014960   bytes     92.17%
    id        compaction   keyspace_1         table_____5     764633368    3904191508   bytes     19.58%
    id        compaction   keyspace_1         table_____1    1856076066    2326634436   bytes     79.78%
    id        compaction   keyspace_1         table_____7     254856804     499133271   bytes     51.06%
    id        compaction   keyspace_1         table_____8    1406859449    1803885628   bytes     77.99%
    id        compaction   keyspace_1         table_____7    1734201253    2308801656   bytes     75.11%
    id        compaction   keyspace_1         table_____1     656195289     931867447   bytes     70.42%
    id        compaction   keyspace_1         table_____1     657036608    1380870812   bytes     47.58%
    id        compaction   keyspace_1         table_____1     235054945   18957522878   bytes      1.24%
    id        compaction   keyspace_1         table____10       2351049       3552009   bytes     66.19%
    id        compaction   keyspace_1         table_____2     810635522     867307196   bytes     93.47%
    id        compaction   keyspace_1         table_____5     281573682     780375396   bytes     36.08%
    id        compaction   keyspace_1         table_____6    2350396501    2398745060   bytes     97.98%
    id        compaction   keyspace_1         table_____1      63122362     434443651   bytes     14.53%
    id        compaction   keyspace_1         table_____3     287859748     399896319   bytes     71.98%
    id        compaction   keyspace_1         table_____2    1776310557    2685522257   bytes     66.14%
    id        compaction   keyspace_1         table_____1     494183426   22432529013   bytes      2.20%
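To put the active compactions in perspective, the sketch below sums how many bytes they still have to process. It assumes the row layout shown in this output (completed, total, unit, progress); `remaining_bytes` is a helper name made up for this example, and the sample contains only the first two rows.

```python
import re

# completed and total byte counts, followed by the unit column
ROW = re.compile(r"\b(\d+)\s+(\d+)\s+bytes\b")


def remaining_bytes(compactionstats_output: str) -> int:
    """Sum of (total - completed) over all running compactions."""
    return sum(int(total) - int(done)
               for done, total in ROW.findall(compactionstats_output))


sample = """\
id        compaction   keyspace_1         table_____4    1591902797    2851758523   bytes     55.82%
id        compaction   keyspace_1         table_____1     193582898     567222689   bytes     34.13%
"""
print(f"{remaining_bytes(sample) / 1e9:.2f} GB left in running compactions")
```

This only covers compactions that are currently running; the 281 pending tasks are not included in the figure.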

nodetool compactionhistory: there are a lot of lines here, so here is a sample:

    id  datatype index   1492056758751   558756       540336       {1:175, 2:6}
    id  datatype index   1492075503279   128269       114446       {1:1160, 2:31}
    id  datatype index   1492072165446   22914902     22464994     {1:626, 2:37}
    id  datatype index   1492060375419   73514456     72842367     {1:398795, 2:7294, 3:300}
    id  datatype index   1492075160893   85707        64387        {1:236, 2:41}
    id  datatype index   1492151303774   139172156    134666782    {1:9129, 2:3313, 3:935, 4:112}
    id  datatype index   1492135037619   30839157     29690968     {1:32854, 2:5702, 3:535, 4:61}
    id  datatype index   1492075521048   255030       253531       {1:220, 2:6}
    id  datatype index   1492116936213   11391100     10943344     {1:6798, 2:301}
    id  datatype index   1492075649703   1527580      1486442      {1:5381, 2:330}
    id  datatype index   1492153054713   218401839    216306589    {1:6669, 2:1068, 3:273, 4:22}
    id  datatype index   1492169550324   9172160      8724129      {1:42943, 2:2390}
    id  datatype index   1492087845445   8086487      7810261      {1:8445, 2:1209, 3:95}
    id  datatype index   1492116806390   837169       806946       {1:5984, 2:262}
    id  datatype index   1492167939189   275277987    271618327    {1:38585, 2:18745, 3:494}
    id  datatype index                   277471932    266321389    {1:47184, 2:16047, 3:367, 4:468}
    id  datatype index   1492116559239   1569590      1402724      {1:460, 2:62}
    id  datatype index   1492173763782   83298080     81977056     {1:36383, 2:7577, 3:3565, 4:95, 6:169}
    id  datatype index   1492158247355   42660621     40224352     {1:6565, 2:987, 3:316, 4:521, 6:17, 8:70}
    id  datatype index   1492179061558   589874248    568901949    {1:16726, 2:9342, 3:1149, 4:141}
    id  datatype index   1492190014331   807975203    786973389    {1:67311, 2:1852}
    id  datatype index   1491949569125   45499223     46212100     {1:3944, 2:523, 3:1268, 4:262}
    id  datatype index   1492063798113   2401         1134         {1:1, 2:3}
    id  datatype index   1492100603829   7693737      7507021      {1:7112, 2:870, 3:235, 4:27}
    id  datatype index   1492202653921   114122963    111721885    {1:2038, 2:2997, 3:1095, 4:48, 5:40}
    id  datatype index   1492063653695   60700        50728        {1:157, 2:12}
    id  datatype index   1492152115922   165656033    159591156    {1:5180, 2:3233, 3:600, 4:564, 5:37, 6:14, 7:12}
    id  datatype index   1492160511587   3353867375   3280857307   {1:12265239, 2:409303, 3:16391, 4:1932}
    id  datatype index   1492116638632   3226315      2863672      {1:956, 2:137}
    id  datatype index   1492050334458   64407        56620        {1:447, 2:31}
    id  datatype index   1492150640640   587181       424081       {1:1293, 2:218, 3:1}
    id  datatype index   1492116731210   429668507    407404356    {1:2208562, 2:131875, 3:338}
    id  datatype index   1492134210449   293003702    275992426    {1:7429, 2:1686, 3:165}
    id  datatype index   1492171984560   8467649      8318775      {1:13330, 2:892, 3:11}
    id  datatype index   1492150632348   424314       368270       {1:356, 2:72, 3:8}
    id  datatype index   1492068676918   677842865    653983357    {1:11042, 2:405}
    id  datatype index   1492160695008   11985228     11689655     {1:3684, 2:1390, 3:441, 4:87}
    id  datatype index                   5906438      5731218      {1:7040, 2:445, 3:27}
    id  datatype index   1492132529903   234019313    220261439    {1:80014, 2:5316}
    id  datatype index                   1646302      1634070      {1:575, 2:17, 3:5}
    id  datatype index   1492145903652   1544764      1527844      {1:1807, 2:295, 3:65, 4:3, 5:6}
    id  datatype index   1492075180569   1034277      986605       {1:6591, 2:235}
    id  datatype index   1491928723944   5823014      5811907      {1:6498}
    id  datatype index   1492075323943   573147       526857       {1:4395, 2:250}
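One way to read this history is to compare input and output sizes per compaction. The sketch below assumes the two numbers before the `{...}` map are bytes_in and bytes_out, which matches the usual column order of nodetool compactionhistory; `reclaimed_fraction` is a name invented for this example, and the sample holds only the first two rows.

```python
import re

# bytes_in and bytes_out are the two integers right before the rows_merged map
PAIR = re.compile(r"\b(\d+)\s+(\d+)\s+\{")


def reclaimed_fraction(history_output: str) -> float:
    """Fraction of input bytes that compaction removed, over all parsed rows."""
    pairs = [(int(a), int(b)) for a, b in PAIR.findall(history_output)]
    bytes_in = sum(a for a, _ in pairs)
    bytes_out = sum(b for _, b in pairs)
    return 1 - bytes_out / bytes_in


sample = """\
id  datatype index   1492056758751   558756       540336       {1:175, 2:6}
id  datatype index   1492075503279   128269       114446       {1:1160, 2:31}
"""
print(f"reclaimed: {reclaimed_fraction(sample):.1%}")
```

If most rows reclaim only a few percent, as these two do, the compactions are merging SSTables with little overlapping or expired data, which would be consistent with disk usage staying flat even while compactions run.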

Your new node should close the compaction gap eventually...

If the CPU is not the limiting factor for compactions, check the compaction_throughput_mb_per_sec param, and review this article: https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsconfigurecompaction.html

Please review nodetool compactionstats and see whether the number of pending tasks is decreasing over time. Also, please attach the output of nodetool cfstats here.
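Checking the trend by hand is tedious, so here is a minimal sketch that extracts the pending task count from saved compactionstats output and compares samples taken over time. The function names are invented for this example, and every sample value other than 281 is hypothetical.

```python
import re


def pending_tasks(compactionstats_output: str) -> int:
    """Extract the number from the 'pending tasks: N' line."""
    match = re.search(r"pending tasks:\s*(\d+)", compactionstats_output)
    if match is None:
        raise ValueError("no 'pending tasks' line found")
    return int(match.group(1))


def is_backlog_shrinking(samples) -> bool:
    """True when the latest sample is lower than the earliest one."""
    return len(samples) >= 2 and samples[-1] < samples[0]


first = pending_tasks("pending tasks: 281\n")
later = [265, 270, 240]  # hypothetical readings taken some hours apart
print(is_backlog_shrinking([first] + later))
```

Comparing only the first and last readings tolerates the count bouncing around in between, which matters if the number fluctuates from one check to the next.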

As an alternative, you can try to re-add the new node with auto_bootstrap off, then run nodetool rebuild, and run a repair later; that should be faster in this case.

Edit:

After reviewing compactionstats: try to decrease the concurrent_compactors property to a lower value. Compactions will take more time to execute, but this should have less impact on overall cluster performance.

