Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Inband VLan #1555

Closed
wants to merge 24 commits into from
Closed

Conversation

ysmanman
Copy link
Contributor

The PR adds Inband VLan support.

What I did

  • intfsorch change to configure IP address for Inband Vlan in kernel.
  • neighorch change to skip adding remote neighbors attached to Inband Vlan.
  • portsorch change to add Inband Vlan.

Why I did it

Support inband VLan

How I verified it

Details if related

This PR has some dependency on PR #1431 and some minor change is needed once that PR is merged.

- intfsorch change to configure IP address for Inband Vlan in kernel.
- neighorch change to skip adding remote neighbors attached to Inband Vlan.
- portsorch change to add Inband Vlan.
@anshuv-mfst
Copy link

@vganesan-nokia (Nokia) could you please review the PR, thanks.

Copy link
Contributor

@lguohan lguohan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i do not see any swss vstest being added.

if(ibport.m_type == Port::VLAN)
{
/* Only create neighbor attached to inband Vlan */
if (m_remoteNeigh.find(ip_address) != m_remoteNeigh.end())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can Inband Vlan have neighbors other than remote neighbors? If not, we do not need to have collecting remote neigh. Just checking the interface of the neigh itself is enough to skip (as it is done for the Inband port type).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. iBGP peer can be resolved over inband Vlan. This kind of inband vlan neighbor needs to be created. But remote neighbors can be skipped.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia why same cannot happen with Inband port ? iBGP neighbor can happen on inband port also ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only neighbors that would be available on the inband port are the static neigbors we add for the remote neighbors (which should be skipped here since these are already added to SAI as part of doVoqSystemNeighTask()). Unlike inband vlan, inband port will not have any other neighbors. So no need to collect or verify whether the neihbor is remote or not. Just check the interface of the neighbor and if the interface is inband interface, skip the neighbor.

For inband port except the ip address of inband port of my asic, all the iBGP neighbors (which are ip address of inband port of other asics) are remote neighbors. These are already programmed in SAI by doVoqSystemNeighTask() so we skip here. But inband vlan will have both the iBGP neighbors (resolved via ARP) which should be programmed here and remote neighbors which are added statically to inband vlan and SAI prgrammed in doVoqSystemNeighTask(). Therefore for inband vlan case, we need to indentify the remote neighbors and skip.

*
* doVoqSystemNeighTask()
*/
if (gPortsOrch->isInbandPort(alias))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments. Whole part of this change should be merged with existing Inband port type handling. PR 1431 is merged. Please rebase and adjust the changes accordingly so that the comments can be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I will rebase and remove comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Comment on lines +3438 to +3440
attr.id = SAI_BRIDGE_PORT_ATTR_ADMIN_STATE;
attr.value.booldata = true;
attrs.push_back(attr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part of the code can be moved in the if block at line 3444. The check for the port.m_type != Port::SYSTEM is in two places unnecessarily.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Will fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

* The following method is called from setVoqInbandIntf() introduced in the following PR:
* https://github.com/Azure/sonic-swss/pull/1431
*/
bool PortsOrch::addInbandVlan(const string &alias, const string &type)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments. PR 1431 is merged. setVoqInbandIntf() can be modified to call this api if the inband type is "vlan". Please rebase and modify setVoqInbandIntf() to make this complete and the comments can be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I will rebase and remove the comment.

Comment on lines +4679 to +4689
if (!addVlan(alias))
{
SWSS_LOG_ERROR("Failed to add inband VLAN %s", alias.c_str());
return false;
}

if (!getPort(alias, inbandVlan))
{
SWSS_LOG_ERROR("Failed to get port object of inband VLAN %s", alias.c_str());
return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not do this vlan addition processing via vlan configuration by having entry in VLAN table in config_db? So that this will be similar to inband port type. For port type, the VOQ_INBAND_INTERFACE configuration is just to designate a port or vlan for interface for cpu-to-cpu communication. We actually do not configure the port or vlan itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We thought about this too. But inband VLAN is not exactly the same as regular SONiC VLANs. For example, the inband VLAN members are cpu ports, which do not have host intfs being created today. Therefore, inband VLAN processing is different from VLAN processing.

@ysmanman
Copy link
Contributor Author

i do not see any swss vstest being added.

Sure, I will add a test.

Comment on lines 570 to 582
if(ibport.m_type == Port::VLAN)
{
/* For inband Vlan, a kernel neigh is added for every remote neighbor
* and the interface that the neigh is attached is inband Vlan.
* neighsyncd adds the kernel neigh to neigh table in APPL_DB again.
* The following change is to skip adding these remote neighbor attached
* to inband Vlan.
*/
if (m_remoteNeigh.find(ip_address) != m_remoteNeigh.end())
{
it = consumer.m_toSync.erase(it);
continue;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic can go as else part (after line 560). The getPortOrch->isInbandPort(alias) and getPortsOrch->getInbandPort(ibport) are redundantly called multiple times.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed.

Comment on lines 1252 to 1264
Port inbandPort;
if (!gPortsOrch->getInbandPort(inbandPort))
{
SWSS_LOG_ERROR("Failed to get inband port/VLAN");
return;
}

if (inbandPort.m_type == Port::VLAN && inbandPort.m_alias == alias)
{
SWSS_LOG_NOTICE("Skip inband VLAN neigh: %s", ip_address.to_string().c_str());
return;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be great if comments are added for this block of code saying that "inband neighbors are not synced to CHASSIS_APP_DB for inband interface type VLAN since the inband neigibors are learned via arp learning within inband VLAN"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the comment as suggested.

SWSS_LOG_INFO("Added inband VLAN %s to STATE_VLAN_TABLE", inbandVlan.m_alias.c_str());
}

m_inbandVlanTimer->stop();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to stop the timer when there is failure in bringing up inband vlan in the kernel? Will it be a good idea to leave the timer running for re-tries?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. Fixed.

vganesan-nokia
vganesan-nokia previously approved these changes Feb 23, 2021
@vganesan-nokia
Copy link
Contributor

@ysmanman, thanks for addressing the comments. Approved the changes.

vganesan-nokia
vganesan-nokia previously approved these changes Mar 23, 2021

//Timer to bring up inband Vlan netdev
auto intervT = timespec { .tv_sec = INBAND_INTF_TIMER_SEC , .tv_nsec = 0 };
m_inbandVlanTimer = new SelectableTimer(intervT);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need timer ? Can you point to design doc that cover this ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since SAI create call is not synchronized, we cannot bring up inband vlan in kernel immediately after creating inband vlan with SAI call. We use this timer to periodically check if it's ready to bring up inband vlan in kernel or not. The timer is stopped once inband vlan is brought up in kernel successfully. I think this is implementation details, and thus is not covered in HLD.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ysmanman how is this different w.r.t Regular VLAN IP Interface ? We don;t need timer there.

Copy link
Contributor Author

@ysmanman ysmanman Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abdosi There are few differences between regular Vlan and Inban Vlan: 0) The members of Inband Vlan are local and remote CPU ports, which don't have kernel devices created today. 1) Inband Vlan is a static configuration and imembership is unlikely user configurable. These are the main reasons that Inband Vlan was implemented different from regular Vlans and therefore we added the timer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sai call is synchronized now starting from 202012 release

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lguohan thanks for the comment. I removed the inband vlan timer since sai call is synchronized now.

{
Port port = it.second;
if (port.m_type != Port::SYSTEM ||
port.m_system_port_info.type == SAI_SYSTEM_PORT_TYPE_LOCAL ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can local cpu port be covered in this same loop ? basically this will make code common w.r.t API addBridgePort() and addVlanMemeber()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. I will try to see if I can move local cpu port processing into the for loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed as what you suggested.

attr.value.s32 = found->second;
/* Create a bridge port with admin status set to UP */
attr.id = SAI_BRIDGE_PORT_ATTR_ADMIN_STATE;
attr.value.booldata = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we skip this for system port ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both local and remote cpu ports need to be added into inband vlan. The port type of remote cpu ports is system port, and thus we have to add the check here to skip the processing that's not required for system port.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelli10 Can you help. We can add system port as part of Bridge. Can we set SAI_BRIDGE_PORT_ATTR_ADMIN_STATE for Remote Systrem Port and what does it mean ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abdosi SAI_BRIDGE_PORT_ATTR_ADMIN_STATE is not currently supported for remote system port.

attr.id = SAI_BRIDGE_PORT_ATTR_FDB_LEARNING_MODE;
auto found = learn_mode_map.find(port.m_learn_mode);
if (found == learn_mode_map.end())
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also any reason to skip this for system port ? code here is generic for add systemport as bridge port.
Yes for inband use case this not needed but i feel this code should be kept generic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both local and remote cpu ports need to be added into inband vlan. The port type of remote cpu ports is system port, and thus we have to add the check here to skip the processing that's not required for system port.

// Inband neighbors are not synced to CHASSIS_APP_DB for VLAN type inband interface
// since these neigibors are learned within inband VLAN.
if (inbandPort.m_type == Port::VLAN && inbandPort.m_alias == alias)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this case only for inband vlan ? Can you please explain which case it falls as per https://github.com/Azure/SONiC/blob/master/doc/voq/voq_hld.md ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In inband vlan case, each asic (precisely its cpu port) participates in inband vlan. Each asic has its own inband IP address that belongs to the same inband vlan network. Each asic learns the inband IPs of other asics as neighbors. These neighbors are mainly used for the communication between SONiC instances like routing protocol peering. Therefor, we added this change to not add inband neighbors into system neigh table. This is described in inband vlan section (2.5.2) in voq HLD.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

FieldValueTuple macFv;
if (inbandPort.m_type == Port::VLAN)
{
macFv = FieldValueTuple("neigh", gMacAddress.to_string());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we are putting our own mac when sync to chassis app db ? remote side will use this to program SAI . Why we need to send our own mac to remote side ? as per design doc also SAI neighbor tables uses actual mac for vlan inband case. can you please clarify.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to install kernel neighbor entries for remote neighbors. In inband vlan case, the mac of the kernel neighbor entry must be the router mac of remote asic, where the remote neighbor was learnt. This was described in section 2.5.2.2 of voq HLD. Use neighbor nbr-0-1 as an example shown in that section. nbr-0-1 is learnt by asic0. A kernel neighbor entry is installed for the neighbor in asic1 with asic0's mac. This is why local router mac is published in chassis app db here. SAI programming for remote neighbor uses encap index, which is an indirection of the actual mac of remote neighbor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks. looks like i overlooked it

Copy link
Contributor

@abdosi abdosi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some changes and clarification

@ysmanman
Copy link
Contributor Author

Added some changes and clarification

@abdosi Thanks for review. I replied/resolved all your comments.

if (m_vidToRidTable->hget("", id, value))
{
stringstream cmd;
cmd << "ip link set " << inbandVlan.m_alias << " up";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot do "ip link" command from orchagent. This approach has to be revisited.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prsunny can you elaborate why we cannot do this in orchagent?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no kernel interactions from orchagent. In the current architecture, this is handled by the manager daemons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prsunny Thanks for the explanation. The kernel interaction was moved to intfmgr as suggested.

vganesan-nokia
vganesan-nokia previously approved these changes Apr 30, 2021
@lgtm-com
Copy link

lgtm-com bot commented May 7, 2021

This pull request introduces 1 alert and fixes 1 when merging f3dc29d into e29d566 - view on LGTM.com

new alerts:

  • 1 for Syntax error

fixed alerts:

  • 1 for Unused import

@ysmanman
Copy link
Contributor Author

ysmanman commented May 7, 2021

i still do not see any vs tests being added.

@lguohan The plan was to re-use the same test added in inband port PR #1602 (so we run the same test twice: one with inband port config and one with inband vlan config). inband port PR was merged recently. I am working on adding the test in this PR.

@lguohan The inband vlan test was added. We reused the inband port test. With the change in this PR, the same test is run twice: one with inband port, and one with inband vlan.

@anshuv-mfst
Copy link

@lguohan , @abdosi - could you please approve and merge (critical PR)

vganesan-nokia
vganesan-nokia previously approved these changes May 12, 2021
@abdosi
Copy link
Contributor

abdosi commented May 13, 2021

where are we doing this ?

image

@ysmanman
Copy link
Contributor Author

ysmanman commented May 13, 2021

where are we doing this ?

image

@abdosi I checked just now. When programming inband vlan neighbor, orchagent passes the RIF of inband vlan to SAI and in HW, the neighbor entry is programmed to local CPU port. It seems SAI/Sdk is picking up local CPU port when the rif is inband vlan. So the HLD may need to be updated.

I don't agree with this. SAI should not be aware of Special case How does it know given RIF is inband RIF ? How do we track this with SAI ?

@ysmanman
Copy link
Contributor Author

where are we doing this ?
image

@abdosi I checked just now. When programming inband vlan neighbor, orchagent passes the RIF of inband vlan to SAI and in HW, the neighbor entry is programmed to local CPU port. It seems SAI/Sdk is picking up local CPU port when the rif is inband vlan. So the HLD may need to be updated.

I don't agree with this. SAI should not be aware of Special case How does it know given RIF is inband RIF ? How do we track this with SAI ?

@abdosi DNX SAI is not aware of inband RIF. The SAI behavior is for all Vlans. More details. When the RIF type of a neighbor is Vlan and the MAC is not learnt, SAI uses local CPU port as destination. MAC learning of inband Vlan is not enabled, therefore SAI picks up cpu port in this case.

@abdosi
Copy link
Contributor

abdosi commented May 15, 2021

where are we doing this ?
image

@abdosi I checked just now. When programming inband vlan neighbor, orchagent passes the RIF of inband vlan to SAI and in HW, the neighbor entry is programmed to local CPU port. It seems SAI/Sdk is picking up local CPU port when the rif is inband vlan. So the HLD may need to be updated.
I don't agree with this. SAI should not be aware of Special case How does it know given RIF is inband RIF ? How do we track this with SAI ?

@abdosi DNX SAI is not aware of inband RIF. The SAI behavior is for all Vlans. More details. When the RIF type of a neighbor is Vlan and the MAC is not learnt, SAI uses local CPU port as destination. MAC learning of inband Vlan is not enabled, therefore SAI picks up cpu port in this case.

Still does not look correct. This also is not standard SAI Implementation . Some SAI do Drop, some might Flood in VLAN. So we can not rely on SAI implementation.

@prsunny do we know if SAI has define standard model what should happen if for VLAN RIF Mac is not learnt ?

@prsunny
Copy link
Collaborator

prsunny commented May 17, 2021

where are we doing this ?
image

@abdosi I checked just now. When programming inband vlan neighbor, orchagent passes the RIF of inband vlan to SAI and in HW, the neighbor entry is programmed to local CPU port. It seems SAI/Sdk is picking up local CPU port when the rif is inband vlan. So the HLD may need to be updated.
I don't agree with this. SAI should not be aware of Special case How does it know given RIF is inband RIF ? How do we track this with SAI ?

@abdosi DNX SAI is not aware of inband RIF. The SAI behavior is for all Vlans. More details. When the RIF type of a neighbor is Vlan and the MAC is not learnt, SAI uses local CPU port as destination. MAC learning of inband Vlan is not enabled, therefore SAI picks up cpu port in this case.

Still does not look correct. This also is not standard SAI Implementation . Some SAI do Drop, some might Flood in VLAN. So we can not rely on SAI implementation.

@prsunny do we know if SAI has define standard model what should happen if for VLAN RIF Mac is not learnt ?

Agree with @abdosi , if mac is not learnt, some platforms may flood within VLAN.

@arlakshm
Copy link
Contributor

@ysmanman, today we add addDirectedBroadcast neighbor entry for vlan. Do we need to skip that for inband vlan
https://github.com/Azure/sonic-swss/blob/6c4c9adda3621d1ae5adead1a91468d532ff792f/orchagent/intfsorch.cpp#L501-L504

@ysmanman
Copy link
Contributor Author

@ysmanman, today we add addDirectedBroadcast neighbor entry for vlan. Do we need to skip that for inband vlan
https://github.com/Azure/sonic-swss/blob/6c4c9adda3621d1ae5adead1a91468d532ff792f/orchagent/intfsorch.cpp#L501-L504
@arlakshm you are right. We can skip directed broadcast for inband vlan. Will update PR.

@abdosi
Copy link
Contributor

abdosi commented Jun 2, 2021

Based on latest discussion with Arista this PR is put on hold as of now.

@rlhui
Copy link
Contributor

rlhui commented Feb 3, 2022

@ysmanman - if this is not needed, please close it. Thanks.

@ysmanman
Copy link
Contributor Author

ysmanman commented Feb 4, 2022

@ysmanman - if this is not needed, please close it. Thanks.

@rlhui yes, this is not needed any more. I will close it.

@ysmanman ysmanman closed this Feb 4, 2022
EdenGri pushed a commit to EdenGri/sonic-swss that referenced this pull request Feb 28, 2022
Dropped IP validation as server takes URL.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants