[Shipped] A Multi-View Solution for PBFT Protocol

Weekly update 5 - 11 April (in milestone “Integration multiview with other core features”)

  • This week we continued to review the integration with other core features and are waiting for the QC team to test all flows.

  • In the meantime, we are preparing for the next milestone ("run multiview with consensus v1 on testnet") on 30 April. We are running a fullnode and 8 shard nodes on testnet. These nodes are syncing old blocks and we are monitoring their progress to catch any issues. So far, we have found and fixed issues related to the database and a race condition.

Next week, we will watch for any issues related to the consensus protocol once these nodes have finished syncing.

4 Likes

Hi @0xkumi,

Next week, the Portal v2 team needs a second review of the custodian reward update, so your team can use this time to review and test locally with the tester to make sure everything is ready for testnet. Your team does not need to worry about the merge, though: we can create a new branch from master-temp-B-deploy-db-v2 and merge your code into it, so that we have a good branch for testing all your features.

Sorry for the inconvenience.

5 Likes

Weekly update 11 - 17 April (in milestone “Integration multiview with other core features”)

  • This week the QC team started to test our code with several flows (transaction, stake, bridge, pdex, fault tolerance). There is an issue with the bridge feature and we are looking into it (it may be caused by handling the wrong chain view). In addition, we found a loophole in the cross-chain interaction features (bridge, portal) when a chain revert happens.

  • We also added a time metric utility so that we can report how much time each feature's processing consumes (thanks to @hungngo).

  • We are implementing a feature on Highway that deliberately creates fork cases based on predefined scenarios (@trungtin2qn1 and @hyng will work on this for 2 weeks).

Next week, we will focus on working with the interoperability team to solve the revert issue.

7 Likes

Hi team,

Sorry that the Portal v2 review (a second review for some custodian reward issues) ran late last week, which meant your team also merged Portal v2 late. After reviewing the Portal v2 code, I see that the portal team relies heavily on getting blocks by height:
image

That means some of their business logic will conflict with the multi-view approach of this solution, because it will read the wrong final state of the view. I think you will need to work together on how to make it work here. As @0xkumi said, when a fork happens, Portal v2 needs either to wait until the final state is chosen or to handle reverts without waiting for the final state. Please meet and discuss this to reach a decision.
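
To make the concern concrete, here is a minimal Go sketch of why "block by height" is ambiguous once several views coexist. Everything in it is an assumption for illustration: View, MultiView, AddView, FinalHashAt and the finalityDepth rule are hypothetical and are not the types or the finality rule used in the pull request.

```go
package main

import (
	"errors"
	"fmt"
)

// finalityDepth is an assumption for this sketch only: a view counts as
// final once the best view is this many blocks above it.
const finalityDepth = 2

// View is a hypothetical chain state after one block.
type View struct {
	Hash   string
	Parent string
	Height uint64
}

// MultiView stores every known view, so competing forks can coexist.
type MultiView struct {
	views map[string]View
	best  View // tip of the highest known branch, may still be reverted
	final View // latest view treated as irreversible
}

func NewMultiView(genesis View) *MultiView {
	return &MultiView{views: map[string]View{genesis.Hash: genesis}, best: genesis, final: genesis}
}

// AddView stores v and moves best/final forward. For brevity it does not
// reject branches that fork below the final view; real code would have to.
func (m *MultiView) AddView(v View) error {
	parent, ok := m.views[v.Parent]
	if !ok {
		return errors.New("unknown parent view")
	}
	if v.Height != parent.Height+1 {
		return errors.New("height must be parent height + 1")
	}
	m.views[v.Hash] = v
	if v.Height <= m.best.Height {
		return nil // side branch: stored, but best is unchanged
	}
	m.best = v
	anc := v
	for i := 0; i < finalityDepth; i++ {
		p, ok := m.views[anc.Parent]
		if !ok {
			return nil // chain is still shorter than finalityDepth
		}
		anc = p
	}
	if anc.Height > m.final.Height {
		m.final = anc
	}
	return nil
}

// FinalHashAt returns the block hash at height only if that height is final.
func (m *MultiView) FinalHashAt(height uint64) (string, bool) {
	if height > m.final.Height {
		return "", false
	}
	cur := m.final
	for cur.Height > height {
		p, ok := m.views[cur.Parent]
		if !ok {
			return "", false
		}
		cur = p
	}
	return cur.Hash, true
}

func main() {
	mv := NewMultiView(View{Hash: "g", Height: 0})
	for _, v := range []View{
		{"a1", "g", 1}, {"b1", "g", 1}, // two competing views at height 1
		{"a2", "a1", 2}, {"a3", "a2", 3},
	} {
		if err := mv.AddView(v); err != nil {
			fmt.Println("add failed:", err)
		}
	}
	hash, ok := mv.FinalHashAt(1)
	fmt.Println(hash, ok) // height 1 is final on branch a
	hash, ok = mv.FinalHashAt(3)
	fmt.Println(hash, ok) // height 3 exists but is not final yet
}
```

In this sketch a feature that needs irreversible data would call something like FinalHashAt and wait when it reports false, instead of reading whatever block currently sits at that height.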

This week, I will continue reviewing the multi-view solution code (without the Portal v2 merge). I will give feedback later. Sorry for the inconvenience.

Thanks

5 Likes

Thanks @thaibao for pointing this out. Other features may encounter this problem as well, so please double-check this whenever you merge code. Thanks.

2 Likes

Hi @team

I’m pre-reviewing based on this pull request. With 158 files and 188 commits, I think I will need to spend quite a few days on it :smile:. We know that this is a pre-review of the multi-view solution code, not the final code, because you still need to merge Portal v2 for the finalized pull request. Here are some review comments for a first round of fixes:

1/ We can remove case data := <-cm.data here, right?
image
because this function is no longer used
image

2/ shardID comes from RPC in some cases, so I think we need to guard against an out-of-range slice index and a nil pointer dereference
image
because if the RPC input is invalid, the node will crash

3/ This function always returns 0 -> remove it?
image
and review where we use it
image

4/ Singleton in beacon pool
image
We can use blockchainObj as a field in the BeaconPool singleton, right? Since every node needs to get the beacon anyway, the singleton beacon pool can hold this blockchainObj and the shard pool can reuse it

5/ Many interfaces of the server object are no longer used
image
image
image
can we remove all of them?

6/ I think 3 functions are not used, right?
image

7/ Remove this condition
image
image

9 Likes
  • The RPC layer must validate this parameter; not doing so would be very careless
  • The input parameter is a byte, a value type ranging from 0 to 255, so it can’t be nil in this case. There is also a condition that checks whether BestView exists before getting it, so I think this function is safe
3 Likes

As I recall, beaconpool/shardpool/crossshardpool are outdated and no longer in use. @0xkumi please verify my answer

3 Likes

We will not use singleton pool in this version.

2 Likes

Weekly update 18 - 24 April (in milestone “Run multiview for consensus v1 on testnet”)

  • After discussing with the pdex bridge team, we decided that the current version is safe to deploy. Because the fixed proposer mechanism prevents forks, we will not focus on revert cases at the current stage.
  • @hungngo worked on partitioning the database per chain (shard, beacon), so that we can easily back up a specific chain.
  • We continue to implement a feature on Highway that deliberately makes fork cases based on the predefined scenario.
  • This week, we reviewed and merged new portal features to our code. We also fixed some minor bugs about concurrency and sinker.

Next week, we want to test the code again with other features before deploying to testnet.

5 Likes

Hi @team

I’m doing the second review of multiview on pull request 880 (after your team merged Portal v2 for the final pull request), and I have a question about ChainInterface. I know that we have blsbft for the fixed-node version and blsbftv2 for the non-fixed-node version, so we have 2 interfaces here
blsbft
image
blsbftv2
image

But there is also a ChainInterface in the consensus folder. What is it for?
image
This is not needed in our design, right?

3 Likes

And by the way, if we only use the ChainInterface in the blsbft folder, where is the function GetPubkeyRole?
image
As far as I remember, this function is used by some pNode features of the app team to check staking nodes. If this function has been removed, you have checked the related RPC features, right?

2 Likes

ChainInterface in consensus/interface.go is no longer used. @0xkumi please confirm this

1 Like

Hi @team,

We have completed the review of PR 880. Tomorrow, we will deploy it on testnet.

Congratulations!

2 Likes

Weekly update 25 April - 1 May (in milestone “Run multiview for consensus v1 on testnet”)

  • We are ready to launch multiview on testnet this week; however, there is a delay due to the holidays.
  • In May, the team will work on the next milestone, in which the proposer rotates round-robin (consensus v2). This is one step closer to decentralizing our network.
  • Regarding team resources, @dungtran and @lam will not join the next phase, as they will research another topic.
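
As a rough illustration of what a round-robin proposer means compared with a fixed one, here is a minimal Go sketch. The selection rule (block height plus round number, modulo committee size) and the name ProposerIndex are assumptions made for this example; the thread does not describe how consensus v2 actually selects the proposer.

```go
package main

import (
	"errors"
	"fmt"
)

// ProposerIndex returns which committee member proposes at the given
// height and round. Hypothetical rule: rotate by (height + round), so
// every member gets a turn and a failed round moves on to the next member.
func ProposerIndex(height, round uint64, committeeSize int) (int, error) {
	if committeeSize <= 0 {
		return 0, errors.New("committee must not be empty")
	}
	return int((height + round) % uint64(committeeSize)), nil
}

func main() {
	committee := []string{"validator-0", "validator-1", "validator-2", "validator-3"}
	for height := uint64(10); height < 14; height++ {
		idx, err := ProposerIndex(height, 0, len(committee))
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("height %d -> %s\n", height, committee[idx])
	}
}
```

With a fixed proposer the index would be constant. Once it rotates, two validators can disagree about whose turn it is, which is the situation where multiple views and fork handling become relevant.
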
4 Likes

I’m running some validator nodes with this multiview solution on testnet, using the docker tag consensus_multiview_20200504_1. Because we changed the state of the beacon and shard chains, this version has to sync the testnet data again. When these nodes are ready for consensus, I will deploy to all testnet nodes. Hopefully they will have synced the full testnet data by tomorrow, so that we can do that tomorrow afternoon.

Thanks

1 Like

Hi @0xkumi @hungngo and team,
I tried to run a node with multiview and realized that it did not sync the data of shard 2, although this node is a validator of shard 2.

Please look into this issue.

Thanks

1 Like

Hi @0xkumi and team

We have an issue with Portal v2: it cannot sync data. So we will delay deploying consensus-v2 multiview on testnet, run an old version with Portal v2, and sync from scratch again with the Portal v2 code to make sure this is not an issue on your team's side.

I will update you later.

Thanks

1 Like

Hi @team, with your consensus-v2 multiview, I can sync the testnet data. But we have a new issue: the validator cannot join consensus and send votes; it only receives votes from other nodes
image

Please look into this issue and fix it

Thanks

1 Like

Thank you @thaibao. We had an issue with the new database layout and have already fixed it.

Weekly update 2 May - 8 May (in milestone “Run multiview for consensus v2 on testnet”)

  • This week, we had some issues with the multiview protocol related to the database and fork logic. They are all fixed and we are re-running on testnet.
  • The QC team also started to test consensus v2, which allows a round-robin proposer.
  • The fork case simulation tool was also finished this week.

Next week, we will focus on testing the consensus v2 code and running simulation cases.

2 Likes