Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[systemd] ASIC status based service bringup on VOQ chassis #7477

Merged
merged 12 commits into from Jul 28, 2021

Conversation

mprabhu-nokia
Copy link
Contributor

@mprabhu-nokia mprabhu-nokia commented Apr 29, 2021

Changes to allow starting per asic services like swss and syncd only if the platform vendor codedetects the asic is detected and notified. The systemd services ordering we want is database->database@->pmon->swss@->syncd@->teamd@->lldp@
There is also a requirement that management, telemetry, snmp dockers can start even if all asic services are not up.

Why I did it

For VOQ chassis, the fabric cards will have 1-N asics. Also, there could be multiple removable fabric cards. On the supervisor, swss and syncd containers need to be started only if the fabric-card is in Online state and respective asics are detected by the kernel. Using systemd, the dependent services can be in inactive state.

How I did it

Introduce a mechanism where all ASIC dependent service wait on its state to be published via PMON to REDIS. Once the subscription is received, the service proceeds to create respective dockers.
For fixed platforms, systemd is unchanged i.e. the service bring up and docker creation happens in the start()/ExecStartPre routine of the .sh scripts.
For VOQ chassis platform on supervisor, the service bringup skips docker creation in the start() routine, but does it in the wait()/ExecStart routine of the .sh scrips.
Management dockers are decoupled from ASIC docker creation.

Other options that were tried/discussed are below.
Option1
Introduce a new bootstrap service. Using systemd service options like "Before" and "Type=notify". The service will be kept waiting using "EXTEND_TIMEOUT_USEC" option for the notify socket.

Relying on systemd After=, Before= hard dependency features to make sure that swss@->syncd@ services come up only after the asic has been detected as Online. However, services like snmp have After=swss which creates a hard dependency on the service to be started.
We have to remove After=swss/syncd tag from snmp, telemetry, mgmt-framework services.

Option2
Use mask/unmask to control asic services

For swss/syncd services, on every boot, systemd-generator will mask these services only on the Supervisor card. When asic is detected by pmon, bootstrap-asic service will unmask swss/syncd for that asic.
(With this option, we may not require a per-asic bootstrap_asic.service. We can just have a single one)

How to verify it

Before asic1 is detected and notified

docker ps -a
docker-lldp:latest                   Exited (137) 4 minutes ago                       lldp1
docker-lldp:latest                   Exited (137) 4 minutes ago                       lldp2
docker-lldp:latest                   Exited (137) 4 minutes ago                       lldp0
docker-lldp:latest                   Up About a minute                                lldp
docker-sonic-mgmt-framework:latest   Up 3 seconds                                     mgmt-framework+++++
docker-sonic-telemetry:latest        Up 3 seconds                                     telemetry+++++
docker-snmp:latest                   Up 2 seconds                                     snmp+++++
docker-router-advertiser:latest      Exited (0) 2 seconds ago                         radv
docker-dhcp-relay:latest             Up About a minute                                dhcp_relay
docker-syncd-brcm:latest             Exited (0) 4 minutes ago                         syncd2
docker-syncd-brcm:latest             Exited (0) 4 minutes ago                         syncd1
docker-teamd:latest                  Exited (0) 4 minutes ago                         teamd1
docker-syncd-brcm:latest             Exited (0) 4 minutes ago                         syncd0
docker-teamd:latest                  Exited (0) 4 minutes ago                         teamd2
docker-teamd:latest                  Exited (0) 4 minutes ago                         teamd0
docker-platform-monitor:latest       Up About a minute                                pmon
docker-orchagent:latest              Exited (0) 4 minutes ago                         swss2
docker-orchagent:latest              Exited (0) 4 minutes ago                         swss0
docker-orchagent:latest              Exited (0) 4 minutes ago                         swss1
docker-fpm-frr:latest                Up About a minute                                bgp0
docker-fpm-frr:latest                Up About a minute                                bgp2
docker-fpm-frr:latest                Up About a minute                                bgp1
docker-database:latest               Up About a minute                                database2
docker-database:latest               Up About a minute                                database0
docker-database:latest               Up About a minute                                database1
docker-database:latest               Up 2 minutes                                     databa

Notify asic0 presence via REDIS
(will come from PMON as part of sonic-net/sonic-platform-daemons#175)

admin@localhost:~$ redis-cli
127.0.0.1:6379> SELECT 6
OK
127.0.0.1:6379[6]> HSET "CHASSIS_ASIC_TABLE|ASIC0" module "FABRIC-CARD0"
(integer) 1

After asic0 is detected and notified

adocker ps -a
IMAGE                                STATUS                       PORTS               NAMES
docker-lldp:latest                   Exited (137) 4 minutes ago                       lldp1
docker-lldp:latest                   Exited (137) 4 minutes ago                       lldp2
docker-lldp:latest                   Up 10 seconds                                    lldp0<<<<<
docker-lldp:latest                   Up 2 minutes                                     lldp
docker-sonic-mgmt-framework:latest   Up 26 seconds                                    mgmt-framework
docker-sonic-telemetry:latest        Up 26 seconds                                    telemetry
docker-snmp:latest                   Up 24 seconds                                    snmp
docker-router-advertiser:latest      Exited (0) 24 seconds ago                        radv
docker-dhcp-relay:latest             Up 2 minutes                                     dhcp_relay
docker-syncd-brcm:latest             Exited (0) 4 minutes ago                         syncd2
docker-syncd-brcm:latest             Exited (0) 4 minutes ago                         syncd1
docker-teamd:latest                  Exited (0) 5 minutes ago                         teamd1
docker-syncd-brcm:latest             Up 9 seconds                                     syncd0<<<<<
docker-teamd:latest                  Exited (0) 5 minutes ago                         teamd2
docker-teamd:latest                  Up 10 seconds                                    teamd0<<<<<
docker-platform-monitor:latest       Up 2 minutes                                     pmon
docker-orchagent:latest              Exited (0) 4 minutes ago                         swss2
docker-orchagent:latest              Up 9 seconds                                     swss0<<<<<
docker-orchagent:latest              Exited (0) 4 minutes ago                         swss1
docker-fpm-frr:latest                Up 2 minutes                                     bgp0
docker-fpm-frr:latest                Up 2 minutes                                     bgp2
docker-fpm-frr:latest                Up 2 minutes                                     bgp1
docker-database:latest               Up 2 minutes                                     database2
docker-database:latest               Up 2 minutes                                     database0
docker-database:latest               Up 2 minutes                                     database1
docker-database:latest               Up 2 minutes                                     database

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012

Description for the changelog

A picture of a cute animal (not mandatory but encouraged)

@arlakshm
Copy link
Contributor

arlakshm commented May 4, 2021

@mprabhu-nokia, today on multi-ASIC device SNMP for example has a dependency on SWSS and requires all the SWSS instances to start before it can start.
With this proposal, I think SNMP service might not become active unless all the fabric cards are present?

@mprabhu-nokia
Copy link
Contributor Author

mprabhu-nokia commented May 4, 2021

@mprabhu-nokia, today on multi-ASIC device SNMP for example has a dependency on SWSS and requires all the SWSS instances to start before it can start.
With this proposal, I think SNMP service might not become active unless all the fabric cards are present?

You are right, looks like all these services (may be some of these don't run on supervisor) have a dependency.

build_templates (master *%=) > grep After *.j2 | grep swss
dhcp_relay.service.j2:After=updategraph.service swss.service syncd.service teamd.service
gbsyncd.service.j2:After=swss.service
iccpd.service.j2:After=updategraph.service swss.service
lldp.service.j2:After=swss{% if multi_instance == 'true' %}@%i{% endif %}.service
macsec.service.j2:After=swss.service syncd.service
mgmt-framework.service.j2:After=database.service swss.service syncd.service
nat.service.j2:After=updategraph.service swss.service syncd.service
radv.service.j2:After=updategraph.service swss.service syncd.service
sflow.service.j2:After=swss.service syncd.service hostcfgd.service
snmp.service.j2:After=updategraph.service swss.service syncd.service
telemetry.service.j2:After=database.service swss.service syncd.service

@SuvarnaMeenakshi I see lldp has multiinstance swss as dependency and snmp does not, what would this mean? May be just that lldp itself is a per namespace service.

@SuvarnaMeenakshi
Copy link
Contributor

@mprabhu-nokia, today on multi-ASIC device SNMP for example has a dependency on SWSS and requires all the SWSS instances to start before it can start.
With this proposal, I think SNMP service might not become active unless all the fabric cards are present?

You are right, looks like all these services (may be some of these don't run on supervisor) have a dependency.

build_templates (master *%=) > grep After *.j2 | grep swss
dhcp_relay.service.j2:After=updategraph.service swss.service syncd.service teamd.service
gbsyncd.service.j2:After=swss.service
iccpd.service.j2:After=updategraph.service swss.service
lldp.service.j2:After=swss{% if multi_instance == 'true' %}@%i{% endif %}.service
macsec.service.j2:After=swss.service syncd.service
mgmt-framework.service.j2:After=database.service swss.service syncd.service
nat.service.j2:After=updategraph.service swss.service syncd.service
radv.service.j2:After=updategraph.service swss.service syncd.service
sflow.service.j2:After=swss.service syncd.service hostcfgd.service
snmp.service.j2:After=updategraph.service swss.service syncd.service
telemetry.service.j2:After=database.service swss.service syncd.service

@SuvarnaMeenakshi I see lldp has multiinstance swss as dependency and snmp does not, what would this mean? May be just that lldp itself is a per namespace service.

Yes, that is right. lldp is per namespace service, so it depends only on that namespace specific swss service.
Whereas SNMP is a single service for the device, and depends on all swss namespace services.

@mprabhu-nokia
Copy link
Contributor Author

@mprabhu-nokia, today on multi-ASIC device SNMP for example has a dependency on SWSS and requires all the SWSS instances to start before it can start.
With this proposal, I think SNMP service might not become active unless all the fabric cards are present?

You are right, looks like all these services (may be some of these don't run on supervisor) have a dependency.

build_templates (master *%=) > grep After *.j2 | grep swss
dhcp_relay.service.j2:After=updategraph.service swss.service syncd.service teamd.service
gbsyncd.service.j2:After=swss.service
iccpd.service.j2:After=updategraph.service swss.service
lldp.service.j2:After=swss{% if multi_instance == 'true' %}@%i{% endif %}.service
macsec.service.j2:After=swss.service syncd.service
mgmt-framework.service.j2:After=database.service swss.service syncd.service
nat.service.j2:After=updategraph.service swss.service syncd.service
radv.service.j2:After=updategraph.service swss.service syncd.service
sflow.service.j2:After=swss.service syncd.service hostcfgd.service
snmp.service.j2:After=updategraph.service swss.service syncd.service
telemetry.service.j2:After=database.service swss.service syncd.service

@SuvarnaMeenakshi I see lldp has multiinstance swss as dependency and snmp does not, what would this mean? May be just that lldp itself is a per namespace service.

Yes, that is right. lldp is per namespace service, so it depends only on that namespace specific swss service.
Whereas SNMP is a single service for the device, and depends on all swss namespace services.

Thanks. @SuvarnaMeenakshi and @arlakshm, do we know the hard dependencies for starting snmp, telemetry etc only if swss is started? Is it just with respect to speeding up boot times for critical services?

files/build_templates/bootstrap-asic.service.j2 Outdated Show resolved Hide resolved
files/image_config/bootstrap-asic/bootstrap_asic.py Outdated Show resolved Hide resolved
return

def enable_feature(asic_name, feature_names, feature_suffixes):
start_cmds = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this service be used to call any platform specific script to generate any required config like config_db.json? I see on multi-asic system, there is no service which creates config_db json file. Now sonic-cfggen requires database services to be running before it may be called to generate config_db json.

This service, bootstrap_asic, starts after database services and is responsible to start the swss and syncd. That makes it in a good spot to let platforms generate any required configs. Could there be a platform hook provided here for platforms to do any specific initialization?

files/image_config/bootstrap-asic/bootstrap_asic.py Outdated Show resolved Hide resolved
@anshuv-mfst
Copy link

Please divide the PR into two parts:

  • Base support for max asic per hw sku
  • swss related changes

@shyam77git
Copy link
Contributor

Are the UT logs enclosed to conversation older/prior to latest commit (3rd one)?
UT logs referring to bootstrap-asic@0.service, bootstrap-asic@1.service etc. whereas to my understanding, latest commit refers refers to spwaning only one bootstrap-asic service masking services at init/bring-up time and then unmasking services as and when ASICs are being detected.

@anshuv-mfst
Copy link

Chassis Subgroup 5/26:
Decided to close #7621 and use this PR for further tracking and changes.

@mprabhu-nokia mprabhu-nokia changed the title [systemd] bootstrap service for pluggable fabric card on VOQ chassis [systemd] ASIC status based service bringup on VOQ chassis May 28, 2021
@anshuv-mfst
Copy link

@judyjoseph - could you please review and approve.

files/scripts/swss.sh Outdated Show resolved Hide resolved
@lgtm-com
Copy link

lgtm-com bot commented Jun 18, 2021

This pull request introduces 1 alert when merging bde49e42d932a6265538a1936f8211ce226aa0fb into f39fc88 - view on LGTM.com

new alerts:

  • 1 for Testing equality to None

@anshuv-mfst
Copy link

@judyjoseph - can you please merge asap please, needed for 202106 release

Each asic dependent service will wait in the ExecStart part of
systemd invocation  for a REDIS notification that respective
ASIC is now online.
Check if kvm-tests pass without changes
check_asic_status() modified
Additional review comments
module key update from module_name to name
Skip additional checks for fixed-platforms
Simplify the logic in the lldp, syncd, swss, teamd scripts.
asic_status checks can be done only for supervisor cards
#!/bin/bash

is_chassis_supervisor() {
if [ -f /etc/sonic/chassisdb.conf ]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to find a different way to identify it is supervisor or linecard as this file /etc/sonic/chassisdb.conf is present in both sup and linecard

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@judyjoseph @abdosi
Quick clarification - The check is correct. In database-chassis.service, that is what is used to start the chassis-database only on the Supervisor. Since the file is not present on the line-card, chassis-database does not start there.

[Unit]
Description=database-chassis container
Requires=docker.service
ConditionPathExists=/etc/sonic/chassisdb.conf
After=docker.service
After=config-chassisdb.service
Requires=config-chassisdb.service
StartLimitIntervalSec=1200
StartLimitBurst=3

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, yes - there is a similar reference here https://github.com/Azure/sonic-buildimage/blob/c508a10c08636c78a3fb4d7454cc93769b8fbf92/files/image_config/config-chassisdb/config-chassisdb#L40.

So even though the file /usr/share/sonic/device/$platform/chassisdb.conf, is present in both supervisor and linecard, the variable "start_chassis_db=1" is set in supervisor which is used to copy the file to /etc/sonic/chassisdb.conf

@mprabhu-nokia, @vganesan-nokia , @shyam77git, @anamehra -- Could you confirm this approach is followed in different chassis models ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shyam77git, @anamehra can you confirm and sign-off these changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the outset, this seems OK i.e. having and checking chassisdb.conf presence on/for Supervisor.
Anand is validating this further on modular chassis to see how it goes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of /etc/sonic/chassisdb.conf, may we have be some other mechanism like an entry in DEVICE_METADATA to distinguish between Supervisor and other cards? Non VOQ chassis may not need a chassis db.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As suggested in PR #8065 (in review comments) we can have a variable like "supervisor=1" in platform_env.conf.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, once that gets committed, we can take it up as part of a bug fix PR. Don't want to block this PR with additional dependencies now. Let me know if that is ok to proceed.

@judyjoseph judyjoseph requested a review from abdosi July 8, 2021 01:50
@judyjoseph
Copy link
Contributor

@anamehra @ngoc-do could you sign off as well ? based on the behavior with this patch in your chassis

Copy link
Contributor

@abdosi abdosi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need sign-off from Cisco on this PR before this can be merged.

@mprabhu-nokia
Copy link
Contributor Author

@shyam77git @anamehra Could one of you please review this?

@anshuv-mfst
Copy link

@ngoc-do - can you please review from Arista team as well, thanks

logger.log_error('Detected no asics on this platform for service {}'.format(service))
sys.exit(1)

# Connect to STATE_DB and subscribe to chassis-module table notifications
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Connect to CHASSIS_STATE_DB?

@judyjoseph judyjoseph merged commit 3fd6e8d into sonic-net:master Jul 28, 2021
judyjoseph pushed a commit that referenced this pull request Aug 4, 2021
Changes to allow starting per asic services like swss and syncd only if the platform vendor codedetects the asic is detected and notified. The systemd services ordering we want is database->database@->pmon->swss@->syncd@->teamd@->lldp@
There is also a requirement that management, telemetry, snmp dockers can start even if all asic services are not up.

Why I did it
For VOQ chassis, the fabric cards will have 1-N asics. Also, there could be multiple removable fabric cards. On the supervisor, swss and syncd containers need to be started only if the fabric-card is in Online state and respective asics are detected by the kernel. Using systemd, the dependent services can be in inactive state.

How I did it
Introduce a mechanism where all ASIC dependent service wait on its state to be published via PMON to REDIS. Once the subscription is received, the service proceeds to create respective dockers.
For fixed platforms, systemd is unchanged i.e. the service bring up and docker creation happens in the start()/ExecStartPre routine of the .sh scripts.
For VOQ chassis platform on supervisor, the service bringup skips docker creation in the start() routine, but does it in the wait()/ExecStart routine of the .sh scrips.
Management dockers are decoupled from ASIC docker creation.
carl-nokia pushed a commit to carl-nokia/sonic-buildimage that referenced this pull request Aug 7, 2021
…#7477)

Changes to allow starting per asic services like swss and syncd only if the platform vendor codedetects the asic is detected and notified. The systemd services ordering we want is database->database@->pmon->swss@->syncd@->teamd@->lldp@
There is also a requirement that management, telemetry, snmp dockers can start even if all asic services are not up.

Why I did it
For VOQ chassis, the fabric cards will have 1-N asics. Also, there could be multiple removable fabric cards. On the supervisor, swss and syncd containers need to be started only if the fabric-card is in Online state and respective asics are detected by the kernel. Using systemd, the dependent services can be in inactive state.

How I did it
Introduce a mechanism where all ASIC dependent service wait on its state to be published via PMON to REDIS. Once the subscription is received, the service proceeds to create respective dockers.
For fixed platforms, systemd is unchanged i.e. the service bring up and docker creation happens in the start()/ExecStartPre routine of the .sh scripts.
For VOQ chassis platform on supervisor, the service bringup skips docker creation in the start() routine, but does it in the wait()/ExecStart routine of the .sh scrips.
Management dockers are decoupled from ASIC docker creation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants