Weekly on-call Summary August 31 - September 6, 2021

@giv

Summary

  • “Beacon out of sync” alerts are too noisy. We may need to reduce sensitivity since most auto-resolve within minutes.
  • Several disk space alerts on mainnet shard0 and shard3 nodes, three of them on /data.
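
One way to reduce the sensitivity of the "out of sync" alert is to page only when a node has stayed out of sync for longer than a hold period, so that incidents which auto-resolve within minutes never reach on-call. The sketch below is an illustration only: the `DebouncedAlert` class and the 10-minute hold are hypothetical and do not describe the current alerting setup.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DebouncedAlert:
    """Fire only when a condition has stayed bad for `hold_seconds`."""

    hold_seconds: float  # hypothetical threshold, e.g. 600 for 10 minutes
    _bad_since: Optional[float] = None  # time the current bad streak began
    _fired: bool = False  # whether this streak has already paged

    def observe(self, out_of_sync: bool, now: float) -> bool:
        """Record one health check; return True when a page should be sent."""
        if not out_of_sync:
            # Condition cleared on its own: reset without paging anyone.
            self._bad_since = None
            self._fired = False
            return False
        if self._bad_since is None:
            self._bad_since = now
        if not self._fired and now - self._bad_since >= self.hold_seconds:
            # Bad for the full hold period: page once for this streak.
            self._fired = True
            return True
        return False
```

With a 600-second hold, a node that falls out of sync and recovers after a few minutes produces no page, while one that stays out of sync past the hold period pages exactly once until it recovers.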

Aug 31, 2021

  • Node Disk Space Alert - mainnet shard0 node (54.149.184.237)
    • Giv: EBS volume was only 8 GB. Increased it to 512 GB.

Sep 2, 2021

  • Node Disk Space Alert - mainnet shard0 node (54.149.184.237)
    • Giv: This time it is on /data, which requires an instance upgrade. Jack and Soph resolved it by upgrading the instance.

  • Node Disk Space Alert - the free space of the mainnet shard0 node (34.216.159.65)
    • Giv: Happened at the same time as 2 other disk space incidents, also on /data.

  • Node Disk Space Alert - the free space of the mainnet shard0 node (54.189.61.183)
    • Giv: Same as the previous one, on /data.

Sep 5, 2021

  • 44.225.100.239:9500 out of sync! - mainnet
    • Giv: Restarted the process. The disk was full, so Jack deleted old archive files to make room.

Sep 6, 2021

  • Node Disk Space Alert - the free space of the mainnet shard3 node (34.227.78.124) abnormal
    • Giv: Increased the EBS volume.