Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[voq/systemlag] Lag id boundary setting for system lag functionality #6488

Merged
merged 1 commit into from Mar 31, 2021

Conversation

vganesan-nokia
Copy link
Contributor

- Why I did it

System lag in VOQ based chassis systems requires unique id for system port aggregator id attribute in LAG object.
This id is generated by lag id generator within set boundary. The boundaries are system capability specific and should
be set during system boot.

Ref: sonic-net/SONiC#697

- How I did it

Changes for setting platform specific lag id boundary id in the chassis
app db. The platform specific lag id boundaries are supplied via
chassisdb.conf in the supervisor card. The lag_id_start and lag_id_end boundary values sourced
from this file are set in chassis app db which will be used by lag id
allocator to allocate unique lag id in atomic fashion

- How to verify it

  • Update chassisdb.conf in supervisor card with two parameters "lag_id_start" and "lag_id_end"
  • Reboot the system.
  • Connect to redis-chassis server. There should be "SYSTEM_LAG_ID_START" and "SYSTEM_LAG_ID_END" keys set with value as given in the chassisdb.conf.
  • Verify that LAG id objects are created in ASIC DB.

- Which release branch to backport (provide reason below if selected)

- Description for the changelog

  • doker_image_ctl.j2 modified to create database.sh with script modification to write lag id keys in chassis app db

- A picture of a cute animal (not mandatory but encouraged)

@anshuv-mfst
Copy link

@judyjoseph - could you please review this PR, thanks.

Signed-off-by: vedganes <vedavinayagam.ganesan@nokia.com>

Changes for setting platfrom specific lag id boundary id in the chassis
app db. The platfrom specific lag id boundaries are supplied via
chassisdb.conf. The lag_id_start and lag_id_end boundary values sourced
from this file are set in chassis app db which will be used by lag id
allocator to allocate unique lag id in atomic fashion
@judyjoseph
Copy link
Contributor

/AzurePipelines run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@judyjoseph
Copy link
Contributor

/AzurePipelines run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@judyjoseph judyjoseph merged commit b313d4d into sonic-net:master Mar 31, 2021
raphaelt-nvidia pushed a commit to raphaelt-nvidia/sonic-buildimage that referenced this pull request May 23, 2021
Signed-off-by: vedganes <vedavinayagam.ganesan@nokia.com>

Changes for setting platfrom specific lag id boundary id in the chassis
app db. The platfrom specific lag id boundaries are supplied via
chassisdb.conf. The lag_id_start and lag_id_end boundary values sourced
from this file are set in chassis app db which will be used by lag id
allocator to allocate unique lag id in atomic fashion
CHASSIS_CONF=/usr/share/sonic/device/$PLATFORM/chassisdb.conf
if [ -f "$CHASSIS_CONF" ]; then
source $CHASSIS_CONF
$SONIC_DB_CLI CHASSIS_APP_DB SET "SYSTEM_LAG_ID_START" "$lag_id_start"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia We tested this change and it's not working at our side. $SONIC_DB_CLI CHASSIS_APP_DB SET "SYSTEM_LAG_ID_START" "$lag_id_start" failed with this error: Invalid database name input : 'CHASSIS_APP_DB'. However, it worked fine if we do this inside chassisDb docker. E.g.,

docker exec -i ${DOCKERNAME} $SONIC_DB_CLI CHASSIS_APP_DB SET "SYSTEM_LAG_ID_START" "$lag_id_start"

So it seems sonic-db-cli is not aware of CHASSIS_APP_DB when databash.sh is executed. Any thought on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ysmanman, This is supposed to be run only in database-chassis docker in supervisor card only. Assuming you are running this in database-chassis docker in supervisor, please check if you have the CHASSIS_APP_DB defined in database_config.json for redis-chassis instance (i.e. in file /var/run/redis-chassis/sonic-db/database_config.json)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia I am not suer that this is actually run in database-chassis docker based on the following code:

https://github.com/Azure/sonic-buildimage/blob/master/files/build_templates/docker_image_ctl.j2#L174-L178

Line 174-175 are running outside the docker, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ysmanman: you are right. It is run outside docker. We can modify the function setPlatfromLagIdBoundaries() to do "docker exec ...". However we do not have to do this or even to run this inside docker. The error you are seeing is because of this CHASSIS_APP_DB not defined in /var/run/redis/sonic-db/database_config.json. We are supposed to have the same content of database_config.json in all instances of /var/run/redis<instance>/sonic-db. If CHASSIS_APP_DB is defined in database_config.json with correct instance and if the instance info is specified, sonic-cfggen will connect to correct redis server.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia We do have CHASSIS_APP_DB defined in database_config.json once switch boots up. However, it appears that /var/run/redis/sonic-db/database_config.json didn't exist yet when setPlatformLagIdBoundaries was invoked. The file was installed after chassisDb was up. Do you know if SONiC ensures /var/run/redis/sonic-db/database_config.json is installed before bringing up chassisDb?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia Unfortunately, we were observing different from what you explained. The way we set lag id range successfully is to modify setPlatformLagIdBoundaries to set it directly inside chassis database docker like this:

docker exec -i ${DOCKERNAME} $SONIC_DB_CLI CHASSIS_APP_DB SET "SYSTEM_LAG_ID_START" "$lag_id_start"

When setPlatformLagIdBoundaries wass called in the script (ref. database.sh) that starts the database docker, it seems only /var/run/redis-chassis/sonic-db/database_config.json was available but not /var/run/redis/sonic-db/database_config.json.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ysmanman, this is as designed. Chassis db starts before regular database starts. So we do not expect the /var/run/redis/sonic-db/database_config.json be available when database-chassis container is started and redis-chassis server started inside database-chassis container. I just checked the docker_image_ctl.j2. The function setPlatformLagIdBoundaries() is missing some piece of code. I do not know how we missed to push these changes to github. Following is the function we have:

function setPlatformLagIdBoundaries()
{
    if [ ! -e "/var/run/redis/sonic-db/database_config.json" ]; then
        mkdir -p /var/run/redis/sonic-db/
        cp /var/run/redis-chassis/sonic-db/database_config.json /var/run/redis/sonic-db
    fi
    CHASSIS_CONF=/usr/share/sonic/device/$PLATFORM/chassisdb.conf
    if [ -f "$CHASSIS_CONF" ]; then
        source $CHASSIS_CONF
        $SONIC_DB_CLI CHASSIS_APP_DB SET "SYSTEM_LAG_ID_START" "$lag_id_start"
        $SONIC_DB_CLI CHASSIS_APP_DB SET "SYSTEM_LAG_ID_END" "$lag_id_end"
    fi
}

But your fix is also valid. I'll put a PR to fix this. Thanks for helping to find this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia Thanks for checking and confirming!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia Thanks for checking and confirming!

@ysmanman, raised issue #7912 and fixed the issue by PR #7911

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia thanks for the update!

carl-nokia pushed a commit to carl-nokia/sonic-buildimage that referenced this pull request Aug 7, 2021
Signed-off-by: vedganes <vedavinayagam.ganesan@nokia.com>

Changes for setting platfrom specific lag id boundary id in the chassis
app db. The platfrom specific lag id boundaries are supplied via
chassisdb.conf. The lag_id_start and lag_id_end boundary values sourced
from this file are set in chassis app db which will be used by lag id
allocator to allocate unique lag id in atomic fashion
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants