Weekly OnCall Rotation
OnCall US Pacific: Leo (leo@harmony.one)
OnCall Asia/EMEA: Yuriy (yuriy@harmony.one)
Duration: 09/28 8:30am - 10/05 8:30am PST
Summary
- Quiet week with two mainnet upgrades/downgrades
Details
09/27:
- Temporarily upgraded the s0 nodes to block some addresses, then downgraded back to 4.2.1 to unblock them
09/30:
- One explorer node ran out of space. There was no warning or page for this node. PagerDuty
- Found 3 m5d.2xlarge nodes OOS and took them out of service. They had caused an api.harmony.one service interruption
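Since the explorer node filled up without any warning, a minimal sketch of a per-node free-space check might look like the following. The function names and the 10% threshold are assumptions for illustration only; they do not reflect the team's actual monitoring or PagerDuty setup.

```python
import shutil


def free_space_fraction(path: str = "/") -> float:
    """Return the fraction of the filesystem containing `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def needs_page(free_fraction: float, threshold: float = 0.10) -> bool:
    """Return True when free space is below the (hypothetical) paging threshold."""
    return free_fraction < threshold
```

A cron job or monitoring agent could call `needs_page(free_space_fraction("/"))` on each node and raise an alert when it returns True.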
10/04:
- Mainnet upgrade to v4.3.0 went smoothly
- Two snapshot nodes were OOS; upgraded them to the next level of nodes in Lightsail
- Node Disk Space Alert: the free space on the mainnet shard0 node (34.216.159.65) was abnormal
- Soph upgraded the two node instances above from 8G to 20G
Takeaways:
- Remove nodes from the ELB as soon as they are put into maintenance mode, to avoid service interruption on RPC nodes
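The takeaway above could be scripted roughly as follows. This is only a sketch: it assumes a Classic ELB and a boto3-style client (for example `boto3.client("elb")`), and the helper name, load balancer name, and instance IDs are hypothetical. An Application or Network Load Balancer would need the target-group API instead.

```python
def drain_from_elb(elb_client, load_balancer_name, instance_ids):
    """Deregister instances from a Classic ELB before maintenance.

    `elb_client` is assumed to expose boto3's Classic ELB method
    `deregister_instances_from_load_balancer`. Returns the IDs of the
    instances that remain registered with the load balancer.
    """
    response = elb_client.deregister_instances_from_load_balancer(
        LoadBalancerName=load_balancer_name,
        Instances=[{"InstanceId": instance_id} for instance_id in instance_ids],
    )
    return [inst["InstanceId"] for inst in response.get("Instances", [])]
```

Running this step before putting a node into maintenance mode would keep the load balancer from routing RPC traffic to it.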