What is OpenSM
#
# See the comments in the following example.
# They explain different keywords and their meaning.
#
port-groups
port-group # using port GUIDs
name: Storage
# "use" is just a description that is used for logging
# Other than that, it is just a comment
use: SRP Targets
port-guid: 0x10000000000001, 0x10000000000005-0x1000000000FFFA
port-guid: 0x1000000000FFFF
end-port-group
port-group
name: Virtual Servers
# The syntax of the port name is as follows:
# "node_description/Pnum".
# node_description is compared to the NodeDescription of the node,
# and "Pnum" is a port number on that node.
port-name: "vs1 HCA-1/P1, vs2 HCA-1/P1"
end-port-group
# using partitions defined in the partition policy
port-group
name: Partitions
partition: Part1
pkey: 0x1234
end-port-group
# using node types: CA, ROUTER, SWITCH, SELF (for node that runs SM)
# or ALL (for all the nodes in the subnet)
port-group
name: CAs and SM
node-type: CA, SELF
end-port-group
end-port-groups
qos-setup
# This section of the policy file describes how to set up SL2VL and VL
# Arbitration tables on various nodes in the fabric.
# However, this is not supported in OFED - the section is parsed
# and ignored. SL2VL and VLArb tables should be configured in the
# OpenSM options file (by default - /var/cache/opensm/opensm.opts).
end-qos-setup
qos-levels
# Having a QoS Level named "DEFAULT" is a must - it is applied to
# PR/MPR requests that didn't match any of the matching rules.
qos-level
name: DEFAULT
use: default QoS Level
sl: 0
end-qos-level
# the whole set: SL, MTU-Limit, Rate-Limit, PKey, Packet Lifetime
qos-level
name: WholeSet
sl: 1
mtu-limit: 4
rate-limit: 5
pkey: 0x1234
packet-life: 8
end-qos-level
end-qos-levels
# Match rules are scanned in order of their appearance in the policy file.
# First matched rule takes precedence.
qos-match-rules
# matching by single criteria: QoS class
qos-match-rule
use: by QoS class
qos-class: 7-9,11
# Name of qos-level to apply to the matching PR/MPR
qos-level-name: WholeSet
end-qos-match-rule
# show matching by destination group and service id
qos-match-rule
use: Storage targets
destination: Storage
service-id: 0x10000000000001, 0x10000000000008-0x10000000000FFF
qos-level-name: WholeSet
end-qos-match-rule
qos-match-rule
source: Storage
use: match by source group only
qos-level-name: DEFAULT
end-qos-match-rule
qos-match-rule
use: match by all parameters
qos-class: 7-9,11
source: Virtual Servers
destination: Storage
service-id: 0x0000000000010000-0x000000000001FFFF
pkey: 0x0F00-0x0FFF
qos-level-name: WholeSet
end-qos-match-rule
end-qos-match-rules
Simple QoS Policy - Details and Examples
Simple QoS policy match rules are tailored for matching PR/MPR requests of ULPs (or of applications running on top of a ULP). This section contains a list of per-ULP (or per-application) match rules and the SL that should be enforced on the matched PR/MPR query. Match rules include:
Default match rule that is applied to PR/MPR query that didn't match any of the other match rules
IPoIB with a default PKey
IPoIB with a specific PKey
Any ULP/application with a specific Service ID in the PR/MPR query
Any ULP/application with a specific PKey in the PR/MPR query
Any ULP/application with a specific target IB port GUID in the PR/MPR query
Since any section of the policy file is optional, as long as the basic rules of the file are kept (such as not referring to a nonexistent port group, having a default QoS Level, etc.), the simple policy section (qos-ulps) can serve as a complete QoS policy file. The shortest policy file in this case would be as follows:
qos-ulps
    default : 0  # default SL
end-qos-ulps
It is equivalent to the previous example of the shortest policy file, and it is also equivalent to not having a policy file at all. Below is an example of a simple QoS policy with all the possible keywords:
qos-ulps
    default                              :0  # default SL
    sdp, port-num 30000                  :0  # SL for application running on
                                             # top of SDP when a destination
                                             # TCP/IP port is 30000
    sdp, port-num 10000-20000            :0
    sdp                                  :1  # default SL for any other
                                             # application running on top of SDP
    rds                                  :2  # SL for RDS traffic
    ipoib, pkey 0x0001                   :0  # SL for IPoIB on partition with
                                             # pkey 0x0001
    ipoib                                :4  # default IPoIB partition,
                                             # pkey=0x7FFF
    any, service-id 0x6234               :6  # match any PR/MPR query with a
                                             # specific Service ID
    any, pkey 0x0ABC                     :6  # match any PR/MPR query with a
                                             # specific PKey
    srp, target-port-guid 0x1234         :5  # SRP when SRP Target is located
                                             # on a specified IB port GUID
    any, target-port-guid 0x0ABC-0xFFFFF :6  # match any PR/MPR query
                                             # with a specific target port GUID
end-qos-ulps
Similar to the advanced policy definition, matching of PR/MPR queries is done in order of appearance in the QoS policy file, with the first match taking precedence, except for the "default" rule, which is applied only if the query did not match any other rule. All other sections of the QoS policy file take precedence over the qos-ulps section. That is, if a policy file has both qos-match-rules and qos-ulps sections, then any query is matched first against the rules in the qos-match-rules section, and only if there was no match is the query matched against the rules in the qos-ulps section. Note that some of these match rules may overlap, so in order to use the simple QoS definition effectively, it is important to understand how each of the ULPs is matched.
IPoIB
An IPoIB query is matched by PKey or by destination GID, in which case the GID is that of the multicast group that OpenSM creates for each IPoIB partition. The default PKey for an IPoIB partition is 0x7fff, so the following three match rules are equivalent:
ipoib:
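The example above is truncated; one plausible reading of the three equivalent forms, assuming the default PKey 0x7fff and an illustrative SL of 0 (the SL value itself is arbitrary here):

```
ipoib : 0
ipoib, pkey 0x7fff : 0
any, pkey 0x7fff : 0
```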
SRP
The Service ID for SRP varies from storage vendor to vendor, thus an SRP query is matched by the target IB port GUID. The following two match rules are equivalent:
srp, target-port-guid 0x1234 :
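The second of the two equivalent forms is cut off above; with an illustrative SL of 5, the pair would read:

```
srp, target-port-guid 0x1234 : 5
any, target-port-guid 0x1234 : 5
```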
Note that any of the above ULPs might contain a target port GUID in the PR query, so in order for these queries not to be recognised by the QoS manager as SRP, the SRP match rule (or any match rule that refers to the target port GUID only) should be placed at the end of the qos-ulps match rules.
MPI
The SL for MPI is configured manually by the MPI administrator. OpenSM does not enforce any SL on MPI traffic, which is why MPI is the only ULP that does not appear in the qos-ulps section.
SL2VL Mapping and VL Arbitration
The OpenSM cached options file has a set of QoS-related configuration parameters that are used to configure SL2VL mapping and VL arbitration on IB ports. These parameters are:
Max VLs: the maximum number of VLs that will be on the subnet
High limit: the limit of High Priority component of VL Arbitration table (IBA 7.6.9)
VLArb low table: Low priority VL Arbitration table (IBA 7.6.9) template
VLArb high table: High priority VL Arbitration table (IBA 7.6.9) template
SL2VL: SL2VL Mapping table (IBA 7.6.6) template. It is a list of VLs corresponding to SLs 0-15 (note that a value of VL15 here means "drop this SL").
There are separate sets of QoS configuration parameters for the various target types: CAs, routers, switch external ports, and the switch's enhanced port 0. The names of these parameters are prefixed accordingly:
qos_ca_ - QoS configuration parameters set for CAs.
qos_rtr_ - parameters set for routers.
qos_sw0_ - parameters set for switches' port 0.
qos_swe_ - parameters set for switches' external ports.
Here is an example of typical default values for CAs and switches' external ports (hard-coded in OpenSM initialisation):
qos_ca_max_vls 15
qos_ca_high_limit 0
qos_ca_vlarb_high 0:4,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0,9:0,10:0,11:0,12:0,13:0,14:0
qos_ca_vlarb_low 0:0,1:4,2:4,3:4,4:4,5:4,6:4,7:4,8:4,9:4,10:4,11:4,12:4,13:4,14:4
qos_ca_sl2vl 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
qos_swe_max_vls 15
qos_swe_high_limit 0
qos_swe_vlarb_high 0:4,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0,9:0,10:0,11:0,12:0,13:0,14:0
qos_swe_vlarb_low 0:0,1:4,2:4,3:4,4:4,5:4,6:4,7:4,8:4,9:4,10:4,11:4,12:4,13:4,14:4
qos_swe_sl2vl 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
VL arbitration tables (both high and low) are lists of VL/Weight pairs. Each list entry contains a VL number (0-14) and a weighting value (0-255) indicating the number of 64-byte units (credits) that may be transmitted from that VL when its turn in the arbitration occurs. A weight of 0 indicates that this entry should be skipped. If a list entry is programmed for VL15, or for a VL that is not supported or not currently configured by the port, the port may either skip that entry or send from any supported VL for that entry.
Note that the same VL may be listed multiple times in the high- or low-priority arbitration table and may, furthermore, appear in both tables. The limit of the high-priority VLArb table (e.g., qos_ca_high_limit) indicates how much data, in units of 4KB, may be transmitted from the high-priority table before an opportunity is given to the low-priority table.
If the 255 value is used, the low priority VLs may be starved.
A value of 0 indicates that only a single packet from the high-priority table may be sent before an opportunity is given to the low-priority table. Keep in mind that ports usually transmit packets of a size equal to the MTU. For instance, with a 4KB MTU a single packet requires 64 credits, so in order to achieve effective VL arbitration for packets of 4KB MTU, the weighting values for each VL should be multiples of 64. Below is an example of SL2VL and VL arbitration configuration on a subnet:
qos_ca_max_vls 15
qos_ca_high_limit 6
qos_ca_vlarb_high 0:4
qos_ca_vlarb_low 0:0,1:64,2:128,3:192,4:0,5:64,6:64,7:64
qos_ca_sl2vl 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
qos_swe_max_vls 15
qos_swe_high_limit 6
qos_swe_vlarb_high 0:4
qos_swe_vlarb_low 0:0,1:64,2:128,3:192,4:0,5:64,6:64,7:64
qos_swe_sl2vl 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
In this example, there are 8 VLs configured on the subnet: VL0 to VL7. VL0 is defined as a high-priority VL, and it is limited to 6 x 4KB = 24KB in a single transmission burst. Such a configuration would suit a VL that needs low latency and uses a small MTU when transmitting packets. The rest of the VLs are defined as low-priority VLs with different weights, while VL4 is effectively turned off.
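The credit arithmetic above can be sanity-checked with a short calculation (a sketch; a credit is a 64-byte unit, per the VL arbitration description):

```python
MTU_BYTES = 4096   # 4KB MTU
CREDIT_BYTES = 64  # one VL arbitration credit = 64 bytes

# A single 4KB packet consumes 64 credits, which is why per-VL weights
# should be multiples of 64 for whole-packet arbitration at this MTU.
credits_per_packet = MTU_BYTES // CREDIT_BYTES

# qos_ca_high_limit 6 allows 6 x 4KB = 24KB per high-priority burst.
high_limit = 6
burst_kb = high_limit * 4  # the high limit is counted in 4KB units

print(credits_per_packet, burst_kb)  # 64 24
```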
Deployment Example
The figure below shows an example of an InfiniBand subnet that has been configured by a QoS manager to provide different service levels for various ULPs.
QoS Deployment on InfiniBand Subnet Example
Enhanced QoS
Enhanced QoS provides a higher resolution of QoS at the service level (SL). Users can configure rate limit values per SL for physical ports, virtual ports, and port groups, using the enhanced_qos_policy_file configuration parameter. Valid values of this parameter:
Full path to the policy file through which Enhanced QoS Manager is configured
"null" - to disable the Enhanced QoS Manager (default value)
Warning
To enable Enhanced QoS Manager, QoS must be enabled in OpenSM.
Enhanced QoS Policy File
The policy file is comprised of two sections:
BW_NAMES: Used to define bandwidth setting and name (currently, rate limit is the only setting). Bandwidth names are defined using the syntax:
<bandwidth-name> = <rate-limit>
Example: My_bandwidth = 50
BW_RULES: Used to define the rules that map the bandwidth setting to a specific SL of a specific GUID. Bandwidth rules are defined using the syntax:
<guid>|<port-group-name> = <sl>:<bandwidth-name>, <sl>:<bandwidth-name>...
Examples:
0x2c90000000025 = 5:My_bandwidth, 7:My_bandwidth
Port_grp1 = 3:My_bandwidth, 9:My_bandwidth
Notes:
When the rate limit is set to 0, the rate is unlimited.
Any SL not specified in a rule is automatically set to a rate limit of 0 (unlimited).
"default" is a well-known name that can be used to define a default rule, applied to any GUID with no rule of its own (if no default rule is defined, any GUID without a specific rule is configured with an unlimited rate limit for all SLs).
Failure to complete parsing of the policy file leads to undefined behaviour. The user must confirm that there are no relevant error messages in the SM log in order to ensure that the Enhanced QoS Manager is configured properly.
An empty file with only 'BW_NAMES' and 'BW_RULES' keywords configures the network with an unlimited rate limit.
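Per the note above, the minimal valid policy file contains just the two section keywords (everything then defaults to an unlimited rate limit):

```
BW_NAMES
BW_RULES
```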
Policy File Example
Below is an example of configuring all ports in the fabric with a rate limit of 50Mbps on SL1, except for GUID 0x2c90000000025, which is configured with a rate limit of 100Mbps on SL1. In this example, all SLs (other than SL1) are unlimited.
BW_NAMES
bw1 = 50
bw2 = 100
BW_RULES
default = 1:bw1
0x2c90000000025 = 1:bw2
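To make the semantics above concrete, here is a sketch (not part of OpenSM) of how such a file could be interpreted: unspecified SLs default to 0 (unlimited), and the "default" rule covers GUIDs without a rule of their own. The GUID 0xdeadbeef below is a hypothetical example value.

```python
def parse_bw_policy(text):
    """Parse a minimal BW_NAMES/BW_RULES policy (illustrative only)."""
    names, rules, section = {}, {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line in ("BW_NAMES", "BW_RULES"):
            section = line
            continue
        key, _, value = (p.strip() for p in line.partition("="))
        if section == "BW_NAMES":
            names[key] = int(value)  # rate limit value
        elif section == "BW_RULES":
            sl_map = {}
            for item in value.split(","):
                sl, bw = (s.strip() for s in item.split(":"))
                sl_map[int(sl)] = names[bw]
            rules[key] = sl_map
    return rules

def rate_limit(rules, guid, sl):
    """0 means unlimited; fall back to the 'default' rule, then unlimited."""
    sl_map = rules.get(guid, rules.get("default", {}))
    return sl_map.get(sl, 0)

policy = """
BW_NAMES
bw1 = 50
bw2 = 100
BW_RULES
default = 1:bw1
0x2c90000000025 = 1:bw2
"""
rules = parse_bw_policy(policy)
print(rate_limit(rules, "0x2c90000000025", 1))  # 100
print(rate_limit(rules, "0xdeadbeef", 1))       # 50 (default rule)
print(rate_limit(rules, "0xdeadbeef", 3))       # 0 (unlimited)
```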
QoS Configuration Examples
The following are examples of QoS configuration for different cluster deployments. Each example provides the QoS level assignment and its administration via OpenSM configuration files.
Typical HPC Example: MPI and Lustre
Assignment of QoS Levels
MPI
Separate from I/O load
Min BW of 70%
Storage Control (Lustre MDS)
Low latency
Storage Data (Lustre OST)
Min BW 30%
Administration
MPI is assigned an SL via the command line:
host1# mpirun -sl 0
OpenSM QoS policy file
qos-ulps
default :0 # default SL (for MPI)
any, target-port-guid OST1,OST2,OST3,OST4 :1 # SL for Lustre OST
any, target-port-guid MDS1,MDS2 :2 # SL for Lustre MDS
end-qos-ulps
Note: In this policy file example, replace OST* and MDS* with the real port GUIDs.
OpenSM options file
qos_max_vls 8
qos_high_limit 0
qos_vlarb_high 2:1
qos_vlarb_low 0:96,1:224
qos_sl2vl 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15
EDC SOA (2-tier): IPoIB and SRP
The following is an example of QoS configuration for a typical enterprise data centre (EDC) with service oriented architecture (SOA), with IPoIB carrying all application traffic and SRP used for storage.
QoS Levels
Application traffic
IPoIB (UD and CM) and SDP
Isolated from storage
Min BW of 50%
SRP
Min BW 50%
Bottleneck at storage nodes
Administration
OpenSM QoS policy file
qos-ulps
default :0
ipoib :1
sdp :1
srp, target-port-guid SRPT1,SRPT2,SRPT3 :2
end-qos-ulps
Note: In this policy file example, replace SRPT* with the real SRP Target port GUIDs.
OpenSM options file
qos_max_vls 8
qos_high_limit 0
qos_vlarb_high 1:32,2:32
qos_vlarb_low 0:1
qos_sl2vl 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15
EDC (3-tier): IPoIB, RDS, SRP
The following is an example of QoS configuration for an enterprise data centre (EDC), with IPoIB carrying all application traffic, RDS for database traffic, and SRP used for storage.
QoS Levels
Management traffic (ssh)
IPoIB management VLAN (partition A)
Min BW 10%
Application traffic
IPoIB application VLAN (partition B)
Isolated from storage and database
Min BW of 30%
Database Cluster traffic
RDS
Min BW of 30%
SRP
Min BW 30%
Bottleneck at storage nodes
Administration
OpenSM QoS policy file
qos-ulps
default :0
ipoib, pkey 0x8001 :1
ipoib, pkey 0x8002 :2
rds :3
srp, target-port-guid SRPT1, SRPT2, SRPT3 :4
end-qos-ulps
Note: In this policy file example, replace SRPT* with the real SRP Target port GUIDs.
OpenSM options file
qos_max_vls 8
qos_high_limit 0
qos_vlarb_high 1:32,2:96,3:96,4:96
qos_vlarb_low 0:1
qos_sl2vl 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15
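A quick sanity check (a sketch) that the high-table weights above match the stated bandwidth goals: SL1 (management) gets 32/320 = 10%, and SL2-SL4 (application, RDS, SRP) each get 96/320 = 30%:

```python
# High-priority VLArb entries from qos_vlarb_high: VL1:32, VL2:96, VL3:96, VL4:96
weights = {1: 32, 2: 96, 3: 96, 4: 96}
total = sum(weights.values())  # 320 credits per full arbitration cycle

shares = {vl: w / total for vl, w in weights.items()}
print(shares)  # VL1 -> 0.1 (management), VL2-VL4 -> 0.3 each
```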
Partition configuration file
Default=0x7fff, ipoib : ALL=full;
PartA=0x8001, sl=1, ipoib : ALL=full;
Adaptive Routing Manager and SHIELD
The Adaptive Routing Manager supports two advanced InfiniBand features: Adaptive Routing (AR) and Self-Healing Interconnect Enhancement for Intelligent Datacenters (SHIELD).
For information on how to set up AR and SHIELD, please refer to the HowTo Configure Adaptive Routing and SHIELD community post.
Congestion Control Manager
The Congestion Control Manager works in conjunction with the Congestion Control mechanism implemented on the switch. To verify whether your switch supports Congestion Control, refer to the switch's firmware release notes. The Congestion Control Manager is a Subnet Manager (SM) plug-in, i.e. a shared library (libccmgr.so) that is dynamically loaded by the Subnet Manager. It is installed as part of the Mellanox OFED installation.
The Congestion Control mechanism controls traffic entry into a network and attempts to avoid over-subscription of any of the processing or link capabilities of the intermediate nodes and networks. Additionally, it takes resource-reducing steps by reducing the rate of sending packets. The Congestion Control Manager enables and configures the Congestion Control mechanism on fabric nodes (HCAs and switches).
Running OpenSM with Congestion Control Manager
The Congestion Control (CC) Manager can be enabled or disabled through the SM options file. To do so, perform the following:
Create the options file. Run:
opensm -c <options-file-name>
Find the 'event_plugin_name' option in the file, and add 'ccmgr' to it:
# Event plugin name(s)
event_plugin_name ccmgr
Run the SM with the new options file:
opensm -F <options-file-name>
Warning
Once Congestion Control is enabled on the fabric nodes, to completely disable it you will need to actively turn it off. Running the SM without the CC Manager is not sufficient, as the hardware continues to function in accordance with the previous CC configuration.
For further information on how to turn off CC, please refer to the "Configuring Congestion Control Manager" section below.
Configuring Congestion Control Manager
The Congestion Control (CC) Manager comes with a predefined set of settings. However, you can fine-tune the CC mechanism and the CC Manager behaviour by modifying some of the options. To do so, perform the following:
Find the 'event_plugin_options' option in the SM options file, and add 'conf_file <cc-mgr-options-file-name>':
# Options string that would be passed to the plugin(s)
event_plugin_options ccmgr --conf_file <cc-mgr-options-file-name>
Run the SM with the new options file:
opensm -F <options-file-name>
Warning
To turn CC off, set 'enable' to 'FALSE' in the Congestion Control Manager configuration file, and run OpenSM once with this configuration.
For further details on the list of CC Manager options, please refer to the IB spec.
Configuring Congestion Control Manager Main Settings
To fine-tune the CC mechanism and the CC Manager behaviour, set the CC Manager's main settings as described below. To enable or disable the Congestion Control mechanism on the fabric nodes, set the following parameter:
Parameter: enable
Values: <TRUE | FALSE>
Default: TRUE
The CC Manager configures the CC mechanism behaviour based on the fabric size. The larger the fabric, the more aggressive the CC mechanism is in its response to congestion. To manually modify the CC Manager behaviour by providing it with an arbitrary fabric size, set the following parameter:
Parameter: num_hosts
Values: [0-48K]
Default: 0 (based on the CCT calculation for the current subnet size)
The smaller the value of this parameter, the faster HCAs will respond to congestion and throttle their traffic. Note that if the number is too low, it will result in suboptimal bandwidth. To change the mean number of packets between marking eligible packets with a FECN, set the following parameter:
Parameter: marking_rate
Values: [0-0xffff]
Default: 0xa
You can set the minimal packet size that can be marked with a FECN. Any packet smaller than this size [bytes] will not be marked with a FECN. To do so, set the following parameter:
Parameter: packet_size
Values: [0-0x3fc0]
Default: 0x200
When the number of send/receive errors or timeouts exceeds 'max_errors' in less than 'error_window' seconds, the CC Manager will abort and allow OpenSM to proceed. To control this, set the following parameters:
Parameter: max_errors
Values: [0-48K]; 0 - zero tolerance (abort configuration on the first error)
Default: 5
Parameter: error_window
Values: [0-48K]; 0 - mechanism disabled (no error checking)
Default: 5
Congestion Control Manager Options File
Option: enable
Description: Enables/disables the Congestion Control mechanism on the fabric nodes.
Values: <TRUE | FALSE>
Default: TRUE

Option: num_hosts
Description: Indicates the number of nodes. The CC table values are calculated based on this number.
Values: [0-48K]
Default: 0 (based on the CCT calculation for the current subnet size)

Option: threshold
Description: Indicates how aggressive the congestion marking should be.
Values: [0-0xf]; 0 - no packet marking, 0xf - very aggressive
Default: 0xf

Option: marking_rate
Description: The mean number of packets between marking eligible packets with a FECN.
Values: [0-0xffff]
Default: 0xa

Option: packet_size
Description: Any packet smaller than this size [bytes] will not be marked with a FECN.
Values: [0-0x3fc0]
Default: 0x200

Option: port_control
Description: Specifies the Congestion Control attribute for this port.
Values: 0 - QP-based congestion control; 1 - SL/port-based congestion control
Default: 0

Option: ca_control_map
Description: An array of sixteen bits, one for each SL. Each bit indicates whether or not the corresponding SL entry is to be modified.
Default: 0xffff

Option: ccti_increase
Description: Sets the CC Table Index (CCTI) increase.
Default: 1

Option: trigger_threshold
Description: Sets the trigger threshold.
Default: 2

Option: ccti_min
Description: Sets the CC Table Index (CCTI) minimum.
Default: 0

Option: cct
Description: Sets all the CC table entries to a specified value. The first entry remains 0, while the specified value is set for the rest of the table. When the value is set to 0, the CCT calculation is based on the number of nodes.
Default: 0

Option: ccti_timer
Description: Sets the given CCTI timer for all SLs. When the value is set to 0, the CCT calculation is based on the number of nodes.
Default: 0

Options: max_errors, error_window
Description: When the number of send/receive errors or timeouts exceeds 'max_errors' in less than 'error_window' seconds, the CC Manager will abort and allow OpenSM to proceed. max_errors = 0: zero tolerance (abort configuration on the first error); error_window = 0: mechanism disabled (no error checking).
Default: 5
DOS MAD Prevention
DOS MAD prevention is achieved by assigning a threshold to each agent's RX. The threshold protects host memory by limiting the number of MADs queued on an agent's RX: incoming MADs above the threshold are dropped and are not queued to the agent's RX.
To enable DOS MAD Prevention:
Go to /etc/modprobe.d/mlnx.conf.
Add the following option to the file:
options ib_umad enable_rx_threshold=1
The threshold value can be controlled from the user-space via libibumad.
To change the value, use the following API:
int umad_update_threshold(int fd, int threshold);
@fd: file descriptor; the agent's RX associated with this fd.
@threshold: new threshold value.
MAD Congestion Control
Warning
MAD Congestion Control is supported in both mlx4 and mlx5 drivers.
The SA Management Datagrams (MADs) are General Management Packets (GMPs) used to communicate with the SA entity within the InfiniBand subnet. The SA is normally part of the subnet manager, and it is contained within a single active instance. Therefore, congestion on the SA communication level may occur.
Congestion control is done by allowing only max_outstanding MADs, where an outstanding MAD is one that has not yet received a response. A FIFO queue holds the SA MADs whose sending is delayed due to max_outstanding overflow. The length of the queue is queue_size and is meant to limit FIFO growth beyond the machine's memory capabilities. When the FIFO is full, SA MADs are dropped, and the drops counter increments accordingly. When the timer (time_sa_mad) expires for a MAD in the queue, the MAD is removed from the queue and the user is notified of the item's expiration.
This feature is implemented per CA port. The SA MAD congestion control values are configurable using the following sysfs entries:
/sys/class/infiniband/mlx5_0/mad_sa_cc/
├── 1
│   ├── drops
│   ├── max_outstanding
│   ├── queue_size
│   └── time_sa_mad
└── 2
    ├── drops
    ├── max_outstanding
    ├── queue_size
    └── time_sa_mad
To print the current value:
cat /sys/class/infiniband/mlx5_0/mad_sa_cc/1/max_outstanding
16
To change the current value:
echo 32 > /sys/class/infiniband/mlx5_0/mad_sa_cc/1/max_outstanding
cat /sys/class/infiniband/mlx5_0/mad_sa_cc/1/max_outstanding
32
To reset the drops counter:
echo 0 > /sys/class/infiniband/mlx5_0/mad_sa_cc/1/drops
Note: The path to the parameter is similar in mlx4 driver:
/sys/class/infiniband/mlx4_0/mad_sa_cc/
Parameters' Valid Ranges
Parameter: max_outstanding
Range: 1 (min) to 2^20 (max)
Default: 16
Parameter: queue_size
Range: 16 (min) to 2^20 (max)
Default: 16
Parameter: time_sa_mad
Range: 1 millisecond (min) to 10000 milliseconds (max)
Default: 20 milliseconds
IB Router Support in OpenSM
In order to enable the IB router in OpenSM, the following parameters should be configured:
IB Router Parameters for OpenSM
Parameter
Description
Default Value
rtr_aguid_enable
Defines whether the SM should create the alias GUIDs required for router support for each port.
0 (Disabled)
rtr_pr_flow_label
Defines the flow label value to use in response for path records related to the router.
0
rtr_pr_tclass
Defines TClass value to use in response for path records related to the router
0
rtr_pr_sl
Defines sl value to use in response for path records related to router.
0
rtr_pr_mtu
Defines MTU value to use in response for path records related to the router.
4 (IB_MTU_LEN_2048)
rtr_pr_rate
Defines rate value to use in response for path records related to the router.
16 (IB_PATH_RECORD_RATE_100_GBS)
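Collected into an opensm.conf fragment (a sketch using the default values from the table; the usual "parameter value" option syntax is assumed):

```
rtr_pr_flow_label 0
rtr_pr_tclass 0
rtr_pr_sl 0
rtr_pr_mtu 4
rtr_pr_rate 16
```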
OpenSM Activity Report
OpenSM can produce an activity report in the form of a dump file that details the different activities performed by the SM. Activities are divided into subjects. The OpenSM Supported Activities table below specifies the different activities currently supported in the SM activity report. Reporting of each subject can be enabled individually using the configuration parameter activity_report_subjects.
Valid values: a comma-separated list of subjects to dump. The currently supported subjects are:
"mc" - activity IDs 1, 2 and 8
"prtn" - activity IDs 3, 4, and 5
"virt" - activity IDs 6 and 7
"routing" - activity IDs 8-12
Two predefined values can be configured as well:
"all" - dump all subjects
"none" - disable the feature by dumping none of the subjects
Default value: "none"
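For example, to dump only multicast and partition activity, the opensm.conf line would be (a sketch; the usual "parameter value" option syntax is assumed):

```
activity_report_subjects mc,prtn
```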
OpenSM Supported Activities
Activity ID
Activity Name
Additional Fields
Comments
Description
1
mcm_member
MLid
MGid
Port Guid
Join State
Join state:
1 - Join
-1 - Leave
Member joined/left MC group
2
mcg_change
MLid
MGid
Change
Change:
0 - Create
1 - Delete
MC group created/deleted
3
prtn_guid_add
Port Guid
PKey
Block index
Pkey Index
Guid added to partition
4
prtn_create
PKey
Prtn Name
Partition created
5
prtn_delete
PKey
Delete Reason
Delete Reason:
0 - empty prtn
1 - duplicate prtn
2 - sm shutdown
Partition deleted
6
port_virt_discover
Port Guid
Top Index
Port virtualisation discovered
7
vport_state_change
Port Guid
VPort Guid
VPort Index
VNode Guid
VPort State
VPort State:
1 - Down
2 - Init
3 - ARMED
4 - Active
Vport state changed
8
mcg_tree_calc
mlid
MCast group tree calculated
9
routing_succeed
routing engine name
Routing done successfully
10
routing_failed
routing engine name
Routing failed
11
ucast_cache_invalidated
ucast cache invalidated
12
ucast_cache_routing_done
ucast cache routing done
Offsweep Balancing
When working with the minhop/dor/updn routing engines, the subnet manager can re-balance routing during idle time (between sweeps).
offsweep_balancing_enabled - enables/disables the feature. Examples:
offsweep_balancing_enabled = TRUE
offsweep_balancing_enabled = FALSE (default)
offsweep_balancing_window - defines window of seconds to wait after sweep before starting the re-balance process. Applicable only if offsweep_balancing_enabled=TRUE. Example:
offsweep_balancing_window = 180 (default)
© Copyright 2023, NVIDIA. Last updated on Oct 23, 2023.