Surgient Success

Community Support Portal

NIC's down after adding host to pool?

Last post 10-19-2007 10:00 AM by admin. 18 replies.
  • 10-15-2007 8:24 AM

    NIC's down after adding host to pool?

    Hi, I have encountered a very strange problem when adding a host to a newly created pool.

    I select the vswitch that is supposed to hold the VMs, and it starts to create VMs etc.

    Then after a while the vmnic connected to that vswitch goes down!

    At first I thought we had physical HW problems, but at the moment we have three NICs that are "Link Down" (the server has 8).

    I tried a reboot of the host but no luck. I can't find any good reason why they are down, but something in the "add host to pool" step seems to create this issue?

    Any input on this matter would be appreciated.

     

    Regards Jerry

  • 10-15-2007 9:05 AM In reply to

    Re: NIC's down after adding host to pool?

    When you say that the NIC went down, do you mean from the perspective of the VMs or from the host?

    Also, what type of NICs are in the box?

    --
    Charles Craig
    Sr. Technical Support Engineer
  • 10-15-2007 9:15 AM In reply to

    Re: NIC's down after adding host to pool?

    The physical NICs on the host go down. (Link down)

    The NICs are Broadcom Gigabit NICs. (HP BL45 G2)

  • 10-15-2007 9:20 AM In reply to

    Re: NIC's down after adding host to pool?

    That may be the issue; we normally require an Intel NIC on the VR network. The Broadcom drivers are probably not responding well to the instructions we are sending them.

    I will ask around, but the quick solution is to put an Intel (PRO/100) card in to use for the VR network.

     

    Moderator comment: Surgient does not require Intel NICs for the Default Network (aka VM or VR network), although this is the configuration of hundreds of hosts in Surgient's own datacenter, and it is the most thoroughly tested configuration.

    --
    Charles Craig
    Sr. Technical Support Engineer
  • 10-15-2007 9:27 AM In reply to

    Re: NIC's down after adding host to pool?

    Considering that the HP BL45 is a blade server, I see that as a problem...

    I wasn't aware of any special hardware requirements concerning the ESX host. Are there other things regarding hardware that might turn into a problem?

    Btw, do you know why the NICs are down and how to get them up again?

    As it is now, I'm looking at a reinstall of ESX to get them up, and that is not what I'd hoped for...

  • 10-15-2007 9:29 AM In reply to

    Re: NIC's down after adding host to pool?

    Jerry,

    Can you provide us with the error message you encountered in the management console when pooling the host? Also, please provide the output of these commands from the ESX Server console operating system:

    esxcfg-nics -l >/tmp/NICs.log

    esxcfg-vswitch -l >>/tmp/NICs.log

    You can either type "cat /tmp/NICs.log" or transfer /tmp/NICs.log to a PC so you can paste it here.

     Thanks

    Signed by Richard Cardona
  • 10-15-2007 9:45 AM In reply to

    Re: NIC's down after adding host to pool?

    Hi Jerry,

    The Intel NIC limitation of 5.3GA is specific to Microsoft Virtual Server hosts, so disregard that since you're on ESX. It sounds like your switch doesn't like the default configuration for the NAIL server. Do you know anything about the way that your switch is configured? A couple of questions for you:

    • Are the uplink ports trunked or access ports?
    • Is STP enabled? (It needs to be.)
    • If so, what's the root bridge priority? (It should be < 32768.)

    Thanks,

    Ezra 

  • 10-15-2007 9:58 AM In reply to

    Re: NIC's down after adding host to pool?

    Hi, the output is: 

     

    vmnic0  09:00.00 bnx2        Down 0Mbps    Half   Broadcom Corporation Broadcom NetXtreme II BCM5708 1000Base-T
    vmnic1  0b:00.00 bnx2        Up   1000Mbps Full   Broadcom Corporation Broadcom NetXtreme II BCM5708 1000Base-T
    vmnic2  0c:00.00 tg3         Down 0Mbps    Half   Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet
    vmnic3  0f:00.00 tg3         Down 0Mbps    Half   Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet
    vmnic4  42:04.00 tg3         Up   1000Mbps Full   Broadcom Corporation HP NC324i Integrated Dual Port PCI Express Gigabit Server Adapter
    vmnic5  42:04.01 tg3         Up   1000Mbps Full   Broadcom Corporation HP NC324i Integrated Dual Port PCI Express Gigabit Server Adapter
    vmnic6  46:00.00 tg3         Up   1000Mbps Full   Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet
    vmnic7  49:00.00 tg3         Up   1000Mbps Full   Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet
    Switch Name    Num Ports   Used Ports  Configured Ports  Uplinks
    vSwitch0       32          3           1024              vmnic1

      PortGroup Name      Internal ID    VLAN ID  Used Ports  Uplinks
      Service Console     portgroup0     0        1           vmnic1

    Switch Name    Num Ports   Used Ports  Configured Ports  Uplinks
    vSwitch1       64          2           64                vmnic3

      PortGroup Name           Internal ID    VLAN ID  Used Ports  Uplinks
      Virtual Machine Network  portgroup24    0        0           vmnic3

    Switch Name               Num Ports   Used Ports  Configured Ports  Uplinks
    vcodbrolab01-IsolatedNet  64          0           64

      PortGroup Name            Internal ID    VLAN ID  Used Ports  Uplinks
      vcodbrolab01-IsolatedNet  portgroup25    4095     0

    /Jerry
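For readers triaging similar output, here is a minimal Python sketch that picks the down NICs out of the listing above. It assumes the column layout seen in Jerry's paste (name, PCI address, driver, link state, speed, duplex, description); the `Nic`, `parse_nics`, and `down_nics` names are made up for this illustration and are not part of any Surgient or VMware tool.

```python
from typing import List, NamedTuple

# The NIC lines exactly as pasted in the post above.
SAMPLE = """\
vmnic0  09:00.00 bnx2        Down 0Mbps    Half   Broadcom Corporation Broadcom NetXtreme II BCM5708 1000Base-T
vmnic1  0b:00.00 bnx2        Up   1000Mbps Full   Broadcom Corporation Broadcom NetXtreme II BCM5708 1000Base-T
vmnic2  0c:00.00 tg3         Down 0Mbps    Half   Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet
vmnic3  0f:00.00 tg3         Down 0Mbps    Half   Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet
vmnic4  42:04.00 tg3         Up   1000Mbps Full   Broadcom Corporation HP NC324i Integrated Dual Port PCI Express Gigabit Server Adapter
vmnic5  42:04.01 tg3         Up   1000Mbps Full   Broadcom Corporation HP NC324i Integrated Dual Port PCI Express Gigabit Server Adapter
vmnic6  46:00.00 tg3         Up   1000Mbps Full   Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet
vmnic7  49:00.00 tg3         Up   1000Mbps Full   Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet
"""


class Nic(NamedTuple):
    """One row of the NIC listing (hypothetical helper type)."""
    name: str
    pci: str
    driver: str
    link: str
    speed: str
    duplex: str
    description: str


def parse_nics(text: str) -> List[Nic]:
    """Split each vmnic line into six fixed columns plus a free-text description."""
    nics = []
    for line in text.splitlines():
        fields = line.split(None, 6)  # at most 7 fields; the last keeps its spaces
        if len(fields) == 7 and fields[0].startswith("vmnic"):
            nics.append(Nic(*fields))
    return nics


def down_nics(text: str) -> List[str]:
    """Return the names of NICs whose link column reads 'Down'."""
    return [nic.name for nic in parse_nics(text) if nic.link == "Down"]


print(down_nics(SAMPLE))  # ['vmnic0', 'vmnic2', 'vmnic3']
```

This matches the three "Link Down" NICs Jerry reported. Note that vmnic3, the only uplink of vSwitch1 in the same paste, is among them.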

  • 10-15-2007 10:01 AM In reply to

    Re: NIC's down after adding host to pool?

    Hi, I know just a few things about how the switch side is configured.

    However:

    The uplink ports are access ports atm; they will be trunked in the future.

    STP is enabled on the switch side. (Why does this need to be enabled?)

    I have no idea about that priority, but I don't see the relevance?

    Since we are using the port as an access port, we shouldn't affect STP/VLAN or any other configuration between the ESX host and the switch?

    /Jerry

    ps Leaving office for today.

  • 10-15-2007 10:09 AM In reply to

    Re: NIC's down after adding host to pool?

    You're currently operating in Standard mode; in Advanced mode, the NAIL server expects more interaction with the switch (e.g. trunked uplink, dedicated bridge port, etc). As of 5.3 SP1, Standard mode will disregard switch configuration, but for the current release, there's some compatibility verification that needs to be performed (like having STP, a root bridge priority greater than the NAIL servers'). Do you see anything of interest in your switch logs?

  • 10-16-2007 9:30 AM In reply to

    Re: NIC's down after adding host to pool?

    Hi,

     

    Just wanted to report that the matter is being investigated, it will take a little time though.

    Is there any whitepaper that describes these requirements in detail that we need to check btw?

     

    /Jerry

  • 10-16-2007 4:24 PM In reply to

    Re: NIC's down after adding host to pool?

     Hi Jerry,

    A white paper is in progress and will be released soon. We'll announce availability on the forum as soon as it's released. I'll keep an eye on the thread to see if you find anything being reported back from your switch. I'm specifically looking for your switch disliking the STP configuration of the NAIL servers or something indicating that it blocked the port that your vmnic is uplinked to.

     

    Cheers,

    Ezra 

  • 10-17-2007 8:08 AM In reply to

    Re: NIC's down after adding host to pool?

    Hi, we changed a few things and have got a bit further now.

    We are running two ports as trunks, as it should be in our prod environment.

    It seems BPDU Guard had shut down the ports earlier; that has been adjusted.

    So right now the ports are up, but we get no communication with the NAIL server when it deploys.

    In the console it says:

    "This nailservers configuration incorrectly identifies it as the root bridge. Can't continue with this configuration..."

    Our root bridge priority at the time should have been 33154 btw.

    Are there any config changes we need to make to this NAIL server, or do you expect us to change things on the switch side?

     

    /Jerry

  • 10-17-2007 8:18 AM In reply to

    Re: NIC's down after adding host to pool?

    jehua:

    ... In the console it says:

    "This nailservers configuration incorrectly identifies it as the root bridge. Can't continue with this configuration..."

    Our root bridge priority at the time should have been 33154 btw.

    ....

    /Jerry

     

    Jerry,

    I'm fairly certain that the fact that the switch's priority is higher than the nail server's configured value of 32768 has triggered the nail server to disable its bridge interface to prevent spanning tree loops (safety mechanism).  As Ezra mentioned above, there is a defect in the GA version of 5.3 wherein the spanning tree loop protection is in place even when the nail server is operating in "Standard" mode (it is really only needed in "Advanced" mode).  AFAIK, until 5.3 SP1 is available, you'll need to tweak the root bridge priority of the switch to be lower than 32768, then re-pool your host.

    I'll let Ezra chime in to confirm, but I wanted to get you set up with some answers ASAP.  Sorry for the inconvenience.

     

    - Evan
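To make the priority comparison concrete, here is a small Python sketch of how standard spanning tree picks a root bridge: the bridge with the numerically lowest bridge ID (priority first, MAC address as tie-breaker) wins. The 33154 and 32768 values come from this thread; the MAC addresses, the lowered priority of 4096, and the function names are hypothetical and chosen only for illustration. It models generic STP election, not the NAIL server's internal check.

```python
from typing import List, Tuple

Bridge = Tuple[str, int, str]  # (name, priority, MAC address)


def bridge_id(priority: int, mac: str) -> Tuple[int, int]:
    """Bridge ID used for comparison: priority first, then MAC as an integer."""
    return (priority, int(mac.replace(":", ""), 16))


def elect_root(bridges: List[Bridge]) -> str:
    """Return the name of the bridge with the numerically lowest bridge ID."""
    return min(bridges, key=lambda b: bridge_id(b[1], b[2]))[0]


# Priorities from the thread; MAC addresses are made up.
before = [
    ("switch", 33154, "00:00:00:00:00:01"),
    ("nail-server", 32768, "00:00:00:00:00:02"),
]

# Hypothetical fix: the switch priority is lowered below 32768.
after = [
    ("switch", 4096, "00:00:00:00:00:01"),
    ("nail-server", 32768, "00:00:00:00:00:02"),
]

print(elect_root(before))  # nail-server
print(elect_root(after))   # switch
```

With the switch at 33154 the NAIL server holds the lowest bridge ID and would win the election, which is consistent with the console message about it being identified as the root bridge.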

  • 10-17-2007 9:10 AM In reply to

    Re: NIC's down after adding host to pool?

    Jerry,

    The vmnic's uplink port needs to be configured as an access port, not trunked, with STP enabled, bpdu-filter disabled. The bridge priority needs to be less than 32999.

     

    HTH,

    Ezra 
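The requirements in this last post can be written down as a simple checklist. The Python sketch below is only an illustration: `UplinkPort` and `check_uplink` are hypothetical names, and because the thread quotes two different limits (32768 earlier, 32999 here), the priority limit is left as a parameter rather than treated as authoritative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UplinkPort:
    """Switch-side settings for the port a vmnic is uplinked to (hypothetical model)."""
    mode: str              # "access" or "trunk"
    stp_enabled: bool
    bpdu_filter: bool
    bridge_priority: int


def check_uplink(port: UplinkPort, max_priority: int = 32999) -> List[str]:
    """Return a list of mismatches against the requirements quoted in the post."""
    problems = []
    if port.mode != "access":
        problems.append("port should be an access port, not trunked")
    if not port.stp_enabled:
        problems.append("STP should be enabled")
    if port.bpdu_filter:
        problems.append("bpdu-filter should be disabled")
    if port.bridge_priority >= max_priority:
        problems.append(
            f"bridge priority {port.bridge_priority} should be less than {max_priority}"
        )
    return problems


# Roughly the setup described on 10-17: trunked ports, STP on, priority 33154.
# The bpdu-filter state was not reported, so False here is an assumption.
current = UplinkPort(mode="trunk", stp_enabled=True, bpdu_filter=False,
                     bridge_priority=33154)

for problem in check_uplink(current):
    print(problem)
```

Run against the configuration described on 10-17, this flags the trunked port and the bridge priority.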
