Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[chassis][multi-dut]changes to support fabric asic in gen-mg and chassis TestbedProcessing #3245

Merged
merged 1 commit into from Jun 21, 2021

Conversation

saravanansv
Copy link
Contributor

  1. Support chassis, multi-duts scenarios in TestbedProcessing.
    When testbed.yaml file contains cardtype=Linecard or cardtype=supervisor,
    TestbedProcessing will add that to lab, veos inventory files.
    When dut is multi-dut, TestbedProcessing will convert the dut type list to string.
  2. Support fabric_info generation in fabric_info.py when the num_asic > 0 and cardtype is supervisor.
  3. Use the fabric_info in minigraph templates to generated the fabric asic info
  4. Use variable name cardtype instead of type in ansible, dut_utils.py and minigraph templates to be more explicit
  5. allow t2 as a topo in testbed.py

Description of PR

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Approach

What is the motivation for this PR?

How did you do it?

How did you verify/test it?

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@saravanansv saravanansv requested a review from a team as a code owner March 31, 2021 06:31
@lgtm-com
Copy link

lgtm-com bot commented Mar 31, 2021

This pull request introduces 8 alerts when merging 006f5ae into 9a50ce4 - view on LGTM.com

new alerts:

  • 6 for Except block handles 'BaseException'
  • 2 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Apr 1, 2021

This pull request introduces 8 alerts when merging 7be2fa2 into a2a435c - view on LGTM.com

new alerts:

  • 6 for Except block handles 'BaseException'
  • 2 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Apr 1, 2021

This pull request introduces 1 alert when merging 855381c into df0da30 - view on LGTM.com

new alerts:

  • 1 for Except block handles 'BaseException'

@saravanansv saravanansv force-pushed the master-chassistest branch 3 times, most recently from 51ca10a to ecb8d83 Compare April 3, 2021 05:28
@anshuv-mfst
Copy link

@yxieca @wangxin - could you please take a look, thanks.

@saravanansv saravanansv force-pushed the master-chassistest branch 2 times, most recently from afca295 to 14c18f3 Compare April 7, 2021 22:26
@yxieca yxieca requested a review from abdosi April 8, 2021 22:35
@yxieca
Copy link
Collaborator

yxieca commented Apr 8, 2021

@abdosi can you check this PR as well?

@saravanansv
Copy link
Contributor Author

@abdosi can you check this PR as well?

Thanks for the approval @yxieca
@abdosi thanks for reviewing in advance

Copy link
Contributor

@arlakshm arlakshm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as comments

ansible/TestbedProcessing.py Show resolved Hide resolved
ansible/library/fabric_info.py Outdated Show resolved Hide resolved
@@ -204,7 +204,7 @@ def write_line_break(self, data=None):
explicit_start=True, Dumper=IncIndentDumper)

def get_testbed_type(self, topo_name):
pattern = re.compile(r'^(t0|t1|ptf|fullmesh|dualtor)')
pattern = re.compile(r'^(t0|t1|ptf|fullmesh|dualtor|t2)')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we restrict to VoQ based T2 device ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a non VoQ based T2 ?
how to determine if this is VoQ based T2 or not in testbed.py ?

ansible/templates/minigraph_dpg_asic.j2 Show resolved Hide resolved
ansible/templates/minigraph_png.j2 Show resolved Hide resolved
@saravanansv saravanansv force-pushed the master-chassistest branch 2 times, most recently from cc729cb to 5a9b97a Compare April 20, 2021 18:50
@saravanansv saravanansv force-pushed the master-chassistest branch 2 times, most recently from cda4e96 to 1e0663a Compare May 6, 2021 21:03
@lgtm-com
Copy link

lgtm-com bot commented May 6, 2021

This pull request introduces 7 alerts when merging 1e0663a into 03744ce - view on LGTM.com

new alerts:

  • 7 for Except block handles 'BaseException'

@saravanansv saravanansv force-pushed the master-chassistest branch 4 times, most recently from a8d0530 to bbb6d2c Compare May 7, 2021 03:49
@anshuv-mfst
Copy link

@SuvarnaMeenakshi - please take a look, thanks

@lgtm-com
Copy link

lgtm-com bot commented May 26, 2021

This pull request introduces 7 alerts when merging 3ac9b72 into 27e39a9 - view on LGTM.com

new alerts:

  • 7 for Except block handles 'BaseException'

@SuvarnaMeenakshi
Copy link
Contributor

This pull request introduces 7 alerts when merging 3ac9b72 into 27e39a9 - view on LGTM.com

new alerts:

  • 7 for Except block handles 'BaseException'

@saravanansv LGTM alert should be fixed.

ansible/library/fabric_info.py Outdated Show resolved Hide resolved
ansible/library/fabric_info.py Show resolved Hide resolved
hwsku: "{{ hwsku }}"
num_fabric_asic: "{{ num_fabric_asics | default(0) }}"
asics_host_pfx: "{{ asics_host_ip | default(None) }}"
asics_host_pfx6: "{{ asics_host_ipv6 | default(None) }}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do these variables get their value?

  • num_fabric_asics
  • asics_host_ip
  • asics_host_ipv6

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are coming from testbed.yaml -> lab -> config_sonic_basedon_testbed.yml

key = "ASIC%d" % asic_id
data = { 'asicname': key,
'ip_prefix': asics_host_pfx % asic_id,
'ip6_prefix': asics_host_pfx6 % asic_id }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes that variables like asics_host_pfx and asics_host_pfx6 must be a string template. I assume it is like: 10.1.0.%s/32
In the template, %s will be replaced with asic_id which is an integer between 0 and num_fabric_asic.

In the example output:

      ansible_facts{
        fabric_info: [{'asicname': 'ASIC0', 'ip_prefix': '10.1.0.33/32', 'ip6_prefix': 'FC00:1::33/128'},
                      {'asicname': 'ASIC1', 'ip_prefix': '10.1.0.34/32', 'ip6_prefix': 'FC00:1::34/128'}]
      }

It seems that there is no way to generate the IP addresses in the example based on a provided prefix and an integer asic ID.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

asics_host_pfx is '10.1.0.%d/32'.
do you want more comments in fabric_info ? or add a sample chassis-dut case to testbed-new.yml ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated to comments with sample arguments that can be passed for asics_host_pfx/asics_host_pfxv6

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are two issues here:

  1. The sample in comments does not match the actual behavior of test code.
  2. Does generating ip_prefix for each ASIC based on asic_id and string template matches the behavior of how the actual ASIC ip_prefix is generated? Is the actual ASIC ip_prefix generated from a base IP plus asic_id?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding 2., yes the generated ip_prefix, asicname with asic_id does work well in the testings so far.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you fix issue 2?

Probably the better way is to add asic_id to a base IP to generate ASIC IP?

ansible/library/fabric_info.py Outdated Show resolved Hide resolved
Copy link
Contributor

@sanmalho-git sanmalho-git left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to add BGPRouterDeclaration for the remote linecards in minigraph_cpg.j2. Other than that, looks good.

ansible/config_sonic_basedon_testbed.yml Outdated Show resolved Hide resolved
ansible/library/port_alias.py Show resolved Hide resolved

if card_type != 'supervisor':
entry += "\tstart_switchid=" + str( start_switchid )
if num_asic is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per SAI VoQ switch proposal (https://github.com/opencomputeproject/SAI/blob/master/doc/VoQ/SAI-Proposal-VoQ-Switch.md#212-sai_switch_attr_switch_id) the switch_id needs to consider number of cores per chip(asic) as well as skip indexes.

https://github.com/opencomputeproject/SAI/blob/master/doc/VoQ/SAI-Proposal-VoQ-Switch.md#16-guideline-for-numbering-switches describes naming the switch ids.

One way to solve this would be to change the name from 'start_switchid' to be 'switchid' and have it be specified in the yml file. To support, multi-asic linecards, this could be a list.

For a single asic:
switchid=[6]

For 3 asic  linecard:
switchid=[6,8,10]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right now the inputs are

  1. start_switchid
  2. default assumption that num_cores is 1

Apart from these two inputs, num_asics can also be an input to TestbedProcessing so that, it can generate the 6,8,10

So that we can make the math done by the TestbedProcessing instead of passing as inputing a list with manual calculations

print("\t\t" + host + " asics_host_ipv6 not found")

try: #get voq_inband_ip
voq_inband_ip = dev.get("voq_inband_ip")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To support multi-asic linecards, it might be helpful to make 'voq_inband_ip', 'voq_inband_ipv6' and 'voq_inband_port' a list with values per asic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it can either be list like you suggested or
just pass voq_inband_ip: "1.1.1.%d/24" % ( switchid ) where switchid iterates from start_switchid till num_asics

@saravanansv saravanansv force-pushed the master-chassistest branch 2 times, most recently from eca7822 to a2afad3 Compare June 2, 2021 21:55
@yxieca
Copy link
Collaborator

yxieca commented Jun 7, 2021

@wangxin @arlakshm are you guys good with this PR?

@wangxin
Copy link
Collaborator

wangxin commented Jun 8, 2021

I am OK with this PR except one issue.

In the fabric_info.py file, I think using string template interpolating asic_id to generate IP prefix for each ASIC is not the correct way. Probably adding ASIC ID to a base IP to get IP prefix for each ASIC is a better way.

@saravanansv
Copy link
Contributor Author

saravanansv commented Jun 9, 2021

I am OK with this PR except one issue.

In the fabric_info.py file, I think using string template interpolating asic_id to generate IP prefix for each ASIC is not the correct way. Probably adding ASIC ID to a base IP to get IP prefix for each ASIC is a better way.

Addressed this by using ipaddress module in python to do the ipaddr-str to int conversion
Tagging @abdosi @sanmalho-git to keep them in loop about this change

…band, iBGP-peers

1. Support chassis, multi-duts scenarios in TestbedProcessing.
When testbed.yaml file contains card_type=Linecard or card_type=supervisor,
TestbedProcessing will add that to lab, veos inventory files.
When dut is multi-dut, TestbedProcessing will convert the dut type list to string.
makeMain needs to publish supported_vm_types=[...] into main.yml
2. Support fabric_info generation in fabric_info.py when the num_asic > 0 and card_type is supervisor.
3. Use the fabric_info in minigraph templates to generated the fabric asic info.
Also added SubRole=fabric, switch_type=fabric in DeviceMetadata for each fabric asic
4. Use variable name card_type instead of type in ansible, dut_utils.py and minigraph templates to be more explicit
5. allow t2 as a topo in testbed.py

Linecard gen-minigraph related changes:
1. support adding inband data in LC's minigraph.
VoqInbandInterfaces fields come from testbed.yaml (fields voq_inband_intf, voq_inband_type, voq_inband_ip) to lab/veos inventory files to minigraph.
Data flows from testbed.yaml (fields voq_inband_intf, voq_inband_type, voq_inband_ip) to lab/veos inventory files to minigraph.

2. System ports

Data flows from
a. port_config.ini (new fileds required are: numVoq, coreId, corePortId), (existing fields: name, speed)
b. switchId = running asic_id count across all linecards in the chassis
c. systemPortId = running systemport count across all linecards in the chassis
Each dut adds its own system-ports to its own ansible_facts.
config_sonic_basedon_testbed.yml loops through all duts system-ports to create all_sysports
d. added changes to port_alias to generate system ports for Recirc ports as well
e. hostname comes from config_sonic_basedon_testbed.yml, asicname created in port_alias from asic_id when looping through num_asics

3. DeviceProperty
   <a:DeviceProperty>
     <a:Name>SwitchType</a:Name>
     <a:Reference i:nil="true"/>
     <a:Value>voq</a:Value>
   </a:DeviceProperty>
   <a:DeviceProperty>
     <a:Name>MaxCores</a:Name>
     <a:Reference i:nil="true"/>
     <a:Value>16</a:Value>
   </a:DeviceProperty>
   <a:DeviceProperty>
     <a:Name>SwitchId</a:Name>
     <a:Reference i:nil="true"/>
     <a:Value>0</a:Value>
   </a:DeviceProperty>

a. switch type, maxcores are directly from testbed.yaml to inventory files to minigraph
b. start_switchid is calculated for each linecard and set into inventory files from TestbedProcessing.py based on num_asics from previous linecards

4. iBGP peers
iBGP sessions are setup between inband-ipaddress on all linecards
a. take all voq_inband_ip to form a list all_inbands in ansible/config_sonic_basedon_testbed.yml
b. minigraph_cpg iterates through all_inbands and configures a BGPSession, BGPPeer between
the voq_inband_ip and all other addresses in all_inbands
@anshuv-mfst
Copy link

Hi @wangxin - could you please help with merge asap, other test PR have dependency on this.

@wangxin wangxin merged commit e6741d3 into sonic-net:master Jun 21, 2021
@saravanansv
Copy link
Contributor Author

I am OK with this PR except one issue.
In the fabric_info.py file, I think using string template interpolating asic_id to generate IP prefix for each ASIC is not the correct way. Probably adding ASIC ID to a base IP to get IP prefix for each ASIC is a better way.

Addressed this by using ipaddress module in python to do the ipaddr-str to int conversion

@abdosi part of addressing this review comment, the import ipaddress got added

arlakshm pushed a commit that referenced this pull request Sep 20, 2021
…3746)

What is the motivation for this PR?
Support for generating minigraph for multi-asic linecards in a VoQ chassis is missing. Also, for multi-asic linecards we needed to add support for Loopback4096 interfaces, such that they can be used as the router-id for the iBGP connectivity.

How did you do it?
Support for minigraph generation for single asic linecards and supervisor card was done as part of PR #3245. This introduced several fields into the ansible inventory required for VoQ chassis linecard. However, for multi-asic linecards these fields in the inventory (from TestbedProcessing.py) are converted to lists instead of a single value. The fields changed are:

switchids - instead of start_switchid
voq_inband_ip
voq_inband_ipv6
voq_inband_intf
We also added the following fields to support Loopback4096 interfaces on the asics of a multi-asic linecard.

loopback4096_ip
loopback4096_ipv6
The minigraph templates were modified to use generate asic level metadata for all asics in a linecard.
We added minigraph_dpg_voq_asic.j2 template to generate asic DPG definitions for multi-asic linecards.

Also modified test_bgp_facts to include BGP_VOQ_CHASSIS_NEIGHBOR's in its checking of all bgp neighbors.

How did you verify/test it?
Generated minigraphs for single asic and multi-asic linecards and supervisor cards and validated configs generated using 'config load_minigraph' command in SONiC.
vmittal-msft pushed a commit to vmittal-msft/sonic-mgmt that referenced this pull request Sep 28, 2021
…band, iBGP-peers (sonic-net#3245)

1. Support chassis, multi-duts scenarios in TestbedProcessing.
When testbed.yaml file contains card_type=Linecard or card_type=supervisor,
TestbedProcessing will add that to lab, veos inventory files.
When dut is multi-dut, TestbedProcessing will convert the dut type list to string.
makeMain needs to publish supported_vm_types=[...] into main.yml
2. Support fabric_info generation in fabric_info.py when the num_asic > 0 and card_type is supervisor.
3. Use the fabric_info in minigraph templates to generated the fabric asic info.
Also added SubRole=fabric, switch_type=fabric in DeviceMetadata for each fabric asic
4. Use variable name card_type instead of type in ansible, dut_utils.py and minigraph templates to be more explicit
5. allow t2 as a topo in testbed.py

Linecard gen-minigraph related changes:
1. support adding inband data in LC's minigraph.
VoqInbandInterfaces fields come from testbed.yaml (fields voq_inband_intf, voq_inband_type, voq_inband_ip) to lab/veos inventory files to minigraph.
Data flows from testbed.yaml (fields voq_inband_intf, voq_inband_type, voq_inband_ip) to lab/veos inventory files to minigraph.

2. System ports

Data flows from
a. port_config.ini (new fileds required are: numVoq, coreId, corePortId), (existing fields: name, speed)
b. switchId = running asic_id count across all linecards in the chassis
c. systemPortId = running systemport count across all linecards in the chassis
Each dut adds its own system-ports to its own ansible_facts.
config_sonic_basedon_testbed.yml loops through all duts system-ports to create all_sysports
d. added changes to port_alias to generate system ports for Recirc ports as well
e. hostname comes from config_sonic_basedon_testbed.yml, asicname created in port_alias from asic_id when looping through num_asics

3. DeviceProperty
   <a:DeviceProperty>
     <a:Name>SwitchType</a:Name>
     <a:Reference i:nil="true"/>
     <a:Value>voq</a:Value>
   </a:DeviceProperty>
   <a:DeviceProperty>
     <a:Name>MaxCores</a:Name>
     <a:Reference i:nil="true"/>
     <a:Value>16</a:Value>
   </a:DeviceProperty>
   <a:DeviceProperty>
     <a:Name>SwitchId</a:Name>
     <a:Reference i:nil="true"/>
     <a:Value>0</a:Value>
   </a:DeviceProperty>

a. switch type, maxcores are directly from testbed.yaml to inventory files to minigraph
b. start_switchid is calculated for each linecard and set into inventory files from TestbedProcessing.py based on num_asics from previous linecards

4. iBGP peers
iBGP sessions are setup between inband-ipaddress on all linecards
a. take all voq_inband_ip to form a list all_inbands in ansible/config_sonic_basedon_testbed.yml
b. minigraph_cpg iterates through all_inbands and configures a BGPSession, BGPPeer between
the voq_inband_ip and all other addresses in all_inbands
vmittal-msft pushed a commit to vmittal-msft/sonic-mgmt that referenced this pull request Sep 28, 2021
…onic-net#3746)

What is the motivation for this PR?
Support for generating minigraph for multi-asic linecards in a VoQ chassis is missing. Also, for multi-asic linecards we needed to add support for Loopback4096 interfaces, such that they can be used as the router-id for the iBGP connectivity.

How did you do it?
Support for minigraph generation for single asic linecards and supervisor card was done as part of PR sonic-net#3245. This introduced several fields into the ansible inventory required for VoQ chassis linecard. However, for multi-asic linecards these fields in the inventory (from TestbedProcessing.py) are converted to lists instead of a single value. The fields changed are:

switchids - instead of start_switchid
voq_inband_ip
voq_inband_ipv6
voq_inband_intf
We also added the following fields to support Loopback4096 interfaces on the asics of a multi-asic linecard.

loopback4096_ip
loopback4096_ipv6
The minigraph templates were modified to use generate asic level metadata for all asics in a linecard.
We added minigraph_dpg_voq_asic.j2 template to generate asic DPG definitions for multi-asic linecards.

Also modified test_bgp_facts to include BGP_VOQ_CHASSIS_NEIGHBOR's in its checking of all bgp neighbors.

How did you verify/test it?
Generated minigraphs for single asic and multi-asic linecards and supervisor cards and validated configs generated using 'config load_minigraph' command in SONiC.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants