Nodes sending invalid p2p messages to the network

To protect the network, we will block nodes that send invalid p2p messages.

One important protection relates to nodes with multiple blskeys installed. If any of your blskeys is not in the current committee and a message is signed with such a non-active key, your node may be blocked in the network. So, if you actively manage a node and often add or remove blskeys, it’s better to remove the non-active blskeys from the node you run.

Since v2.1.8, our code excludes non-active blskeys from signing messages. However, a validator running a legacy version or a customized build may still sign messages with non-active blskeys. In this case, the valid blskeys in the same node won’t sign the consensus blocks, thus not earning block rewards.

We’ve noticed that some nodes are constantly sending messages to the network signed by keys that are not in the current committee/epoch. So, if we implement and deploy this change, you should remove the non-active blskeys from your node, e.g. from the .hmy/blskeys directory.

Hey Leo,

Is the blocking done immediately when these packets are seen, or is there some cool-off period?
I’m asking because right after an epoch change there might be leftover BLS keys…

Regards,
JB273

By default, with the latest code, the node won’t sign the message with the key that is not in the current committee/epoch. This is designed to prevent nodes intentionally signing and sending invalid messages to spam the network.
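As a rough illustration of the behaviour described above, here is a minimal Python sketch of filtering local keys against the current committee before signing. It is not the actual node code: the function name `keys_allowed_to_sign` and the short placeholder key strings are assumptions made for the example.

```python
from typing import Iterable, List, Set


def keys_allowed_to_sign(local_keys: Iterable[str], committee_keys: Set[str]) -> List[str]:
    """Return the local BLS public keys that are in the current committee.

    Hypothetical helper: keys are plain strings here, and the order of
    local_keys is preserved in the result.
    """
    return [key for key in local_keys if key in committee_keys]


# Placeholder key names, not real BLS public keys.
local = ["keyA", "keyB", "keyC"]
committee = {"keyA", "keyC", "keyZ"}

signing = keys_allowed_to_sign(local, committee)
print(signing)  # prints ['keyA', 'keyC']; keyB stays on disk but does not sign
```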

There is a cool-off period of a few minutes. However, as I pointed out below, this is not an issue for a valid node running valid code, as the node won’t sign with non-active keys. But to be safe, and for actively managed nodes, it is better to manage the keys accordingly.

Note that signing with non-active blskeys is just one kind of p2p spamming. We will also blocklist any node sending invalid p2p messages, as a way to protect the network from spammers.

Unfortunate. This makes running a validator more time consuming and more complicated… the exact opposite of the direction we need to be going.

Would it be possible to extend the Epoch time also, so that running a node is less time consuming?

When will this take effect?

I agree with Ogre. As is, working with the epochs and ensuring a smooth transition is taxing and time-consuming, and requires working at odd hours. This change would mean more steps whenever BLS keys need to be adjusted. I understand the issue being raised here and support whatever is required in the short term. I do, however, strongly suggest that automated BLS key management be a first-class tool shipped with the Harmony-created binaries. I know many large validators have created their own automation, so why not generalize it and empower all validators so they are on an equal footing?

I think there can be implementations that don’t require user changes: just filter out the BLS signing keys that are not in the committee when sending consensus messages. What’s your idea?

The solution presented is likely the best one.

Working on the assumption that the extra BLS keys on a node, that aren’t included in the validator, are still signing and spamming the network:

Updating the protocol will take considerable time, while educating the Validators is far more effective.

Yes, this will add maintenance time for Validators, but the alternative would be countless hours spent by the team reworking the code. Testing. Tweaking. Testing again.

It could be weeks before it is ready to roll out. Having Validators not run extra keys on their nodes makes sense, given the situation.

I would simply ask that we lengthen the epoch time, so that we have more days of full sleep before the next middle of the night epoch.

Hi @leo, I have a question regarding “not elected validators”: if they are currently running a node with BLS keys that are not in the committee, are they also affected?

If I understand right, this solves the problem that Ogre and Smartstake are talking about, right? With this, we can run our node with a fixed number of BLS keys and just add or remove them from the validator at our convenience, without spamming the network with the BLS keys that are in the node but not part of the validator (assuming we have updated our node to the latest version).

Hey Leo, no issue for me with the proposal. Just a few questions:

  1. What is the process for a node to be unblocked?
  2. How do we know if the node is blocked? Are there any logs we need to look out for?

This one will be challenging and would cause single-node runners to lose some signatures during the restart of the node, so for me it is not really something I would advise. We need to be sure 2.1.8 is not banning the node.

Need to think about a way to simplify this long term. Running a node needs to get simpler not more complex

Check out VCs dumped 315 mill ones from june first. From Harmony team wallet to VC’s wallet to Binance. :rofl::rofl::rofl::sweat_smile::sweat_smile::sweat_smile:

For “not elected validators”, they shouldn’t sign any consensus messages, then they are fine.

Soph,

The block will be a timed block lasting a few minutes. We will try to publish some info if we find that a node is blocked. We can add an API though.
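To make the idea of a timed block concrete, here is a minimal Python sketch of a blocklist whose entries expire on their own. It only illustrates the concept: the class name `TimedBlocklist`, the peer ID strings, and the 300-second duration are assumptions, since the thread only says "a few minutes".

```python
import time
from typing import Callable, Dict


class TimedBlocklist:
    """Hypothetical blocklist where each entry expires after block_seconds."""

    def __init__(self, block_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._block_seconds = block_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._blocked_until: Dict[str, float] = {}

    def block(self, peer_id: str) -> None:
        """Block a peer (or refresh its block) starting from now."""
        self._blocked_until[peer_id] = self._clock() + self._block_seconds

    def is_blocked(self, peer_id: str) -> bool:
        """Return True while the block is active; drop the entry once it expires."""
        expiry = self._blocked_until.get(peer_id)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._blocked_until[peer_id]
            return False
        return True


# Assumed duration: 300 seconds stands in for "a few minutes".
blocklist = TimedBlocklist(block_seconds=300)
blocklist.block("peer-1")
print(blocklist.is_blocked("peer-1"))  # True right after blocking
print(blocklist.is_blocked("peer-2"))  # False, never blocked
```

With this shape, a blocked node needs no manual unblock step: once the timer runs out, the next check simply lets it through again.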

Agree with other users. With short epoch times and a constantly changing median stake, the current requirements to manage a validator are already above average compared to other projects; thus adding more steps is not desirable.

If the burden is to fall on the validator to manage this, there should be a) a grace period before blocking, b) valid monitoring/alerting, either in the block explorer or, at the very least, a script written by the team that validators can run locally to assess whether or not they are required to make changes (there are several projects that utilise Telegram bots that alert on failure), and c) a change made in the logging that is more granular than ‘sending invalid p2p messages’, so validators can set up custom alerts against this specific log entry.
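A local check of the kind suggested in b) could look roughly like the Python sketch below. It is only an illustration under stated assumptions: that key files are named `<public key>.key`, and that the set of currently active public keys has already been obtained somehow (how to fetch it is not covered in this thread). The function name `find_inactive_key_files` is made up for the example.

```python
from pathlib import Path
from typing import List, Set


def find_inactive_key_files(keys_dir: Path, active_pubkeys: Set[str]) -> List[Path]:
    """List key files in keys_dir whose public key is not in active_pubkeys.

    Assumption: each key file is named "<public key>.key", so the file stem
    is taken to be the public key.
    """
    return sorted(
        path for path in keys_dir.glob("*.key")
        if path.stem not in active_pubkeys
    )


if __name__ == "__main__":
    # Directory taken from the thread; the active set is a placeholder that
    # a validator would have to fill in from their own source.
    keys_dir = Path.home() / ".hmy" / "blskeys"
    active = {"replace-with-active-public-key"}
    for stale in find_inactive_key_files(keys_dir, active):
        print(f"not in current committee: {stale.name}")
```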

Beyond that, before such a change takes place, there should be clear rules for a) how long a node is suspended and b) the steps required to rejoin consensus.

The most preferable action is obviously for the team to improve the startup sequence so a validator can autodetect which keys are currently added or removed. Why not add an entry to the code so that when you --remove-bls-key it automatically mv’s the key from .hmy/blskeys to a temporary directory in ~, and when you --add-bls-key it moves it back? (This will only cover the scenario where a user is running the keys locally, but it’s a start.)
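The move-out and move-back idea can be sketched in a few lines of Python. This is only an illustration of the suggestion, not something the CLI does today: the function name `move_key_files`, the parking directory name, and the assumption that a key's files all start with `<public key>.` are hypothetical.

```python
import shutil
from pathlib import Path
from typing import List


def move_key_files(pubkey: str, src_dir: Path, dst_dir: Path) -> List[Path]:
    """Move every file named "<pubkey>.<ext>" from src_dir to dst_dir.

    Assumption: all files belonging to one key share the public key as
    their file name prefix. Returns the new paths of the moved files.
    """
    dst_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(src_dir.glob(pubkey + ".*")):
        target = dst_dir / path.name
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved


if __name__ == "__main__":
    active_dir = Path.home() / ".hmy" / "blskeys"  # directory named in the thread
    parked_dir = Path.home() / "blskeys-parked"    # hypothetical parking directory
    pubkey = "replace-with-public-key"             # placeholder

    # After removing the key from the validator: park its files.
    move_key_files(pubkey, active_dir, parked_dir)
    # Before adding the key back: restore them with the directories swapped.
    # move_key_files(pubkey, parked_dir, active_dir)
```

Using one function for both directions keeps park and restore symmetric, so a key that was parked can always be brought back the same way.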

Hey @leo_hao :wave:,

Can automation be implemented at the protocol level to add and remove keys? How much work and time would this require :thinking:… Thanks Leo :ok_hand: