Not able to view CIM data, but alerts work

This forum supports the ESX Host Health Monitor plugin. When posting, include screenshots of the issue and any script and command logs listed in the probe consoles.
jlester
Posts: 10
Joined: Thu Apr 02, 2020 6:29 pm

Not able to view CIM data, but alerts work

Post by jlester »

Running into an issue where we aren't able to view CIM data, but it seems like the data may be there, as we still get occasional alerts. Just makes it a bit harder to troubleshoot the issue.

The main ESX health monitor displays status and last scan date. And if we hover over the statuses, we do see a truncated list of errors (if there are any). But when we right click > View CIM Data, the window is blank.

We're running hosted CWA v2022.8, plugin version 5.0.0.4. Any advice would be much appreciated! Thank you!!

EDIT: we have tried recreating all probes, disabling/re-enabling the plugin, and different CWA users. All yield the same blank CIM page.
[Attachment: esxhm.jpg]
[Attachment: esxhm-cim.jpg]

Cubert
Posts: 2430
Joined: Tue Dec 29, 2015 7:57 pm

Re: Not able to view CIM data, but alerts work

Post by Cubert »

[Attachment: Screenshot 2022-08-30 085454.png]

The fact that you have this in your CIM window says that we are speaking with the ESX host. Otherwise there would be no data here either.

To troubleshoot this issue:
You need to manually test your output to see what is being returned during scans. To do this, you can use this post as a guide:
viewtopic.php?t=5532

To break down how we scan an agent, here is a brief description of the process. The automation process in our plugin schedules a script on each agent you set as the probe to request status and CIM data from the ESX hosts that are assigned to that agent. This is done several times a day to test the status of the ESX hosts.

During the script run, it makes two requests to the ESX host. The first is for current status, which produces what you see in the main ESX host status window. It typically starts with an "OK -", "Warning -" or "Alert -" followed by the physical system type, serial number and BIOS date (as seen in the image above). The second request asks for the CIM data from the system and stores it back in the database.

From what you posted here, it appears that request 2 did not produce any CIM data to save. There could be several reasons for this, so we will need to run a few tests on the agent to see where it's not going right.

Possible issues:

#1 ESX host returns errors instead of CIM data
#2 ESX returns nothing
#3 Script request is corrupted or malformed
#4 Agent timing out or giving up
# etc...

So first I would log into the agent and verify that C:\Python3\check_esxi_hardware.py exists, then craft a command line entry that looks like this. If your files are in a different location, use that location in the command.

Code: Select all

C:\Python3\Python.exe -W ignore C:\Python3\check_esxi_hardware.py -v -H hostIP -U root -P "ThePassword" 

Passing the -v switch produces the CIM data and not passing it will produce status only. We know that status looks to be working but you can confirm by removing the -v switch.
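If you want to flip between the two modes without retyping, here is a minimal sketch that assembles the same command line as an argument list. It assumes the default C:\Python3 paths from the example above; the helper name `build_check_command` and its parameters are made up for illustration and are not part of the plugin.

```python
# Sketch only: assembles the manual test command shown above as an argument list.
# build_check_command and its parameter names are hypothetical helpers.
PYTHON_EXE = r"C:\Python3\Python.exe"
SCRIPT = r"C:\Python3\check_esxi_hardware.py"


def build_check_command(host, user, password, verbose=True):
    """Return the argument list; verbose=True adds -v (CIM data), False gives status only."""
    cmd = [PYTHON_EXE, "-W", "ignore", SCRIPT]
    if verbose:
        cmd.append("-v")
    cmd += ["-H", host, "-U", user, "-P", password]
    return cmd


if __name__ == "__main__":
    # Print both variants so they can be pasted into a prompt on the agent.
    print(" ".join(build_check_command("hostIP", "root", "ThePassword")))
    print(" ".join(build_check_command("hostIP", "root", "ThePassword", verbose=False)))
```

Passing the list to a process launcher rather than a single string avoids quoting problems when the password contains spaces or special characters.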

If this produces a host error from ESX, then you have issues with the ESX host and may need to reboot it. If it produces a local agent error (file not found, or cannot make SSL connection due to a missing module), that would be a Python issue on the agent.

The fact that we have a status leads us to believe the agent side is OK, so I would be looking at the return being nonexistent or malformed.
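As a rough aid for sorting a manual run into the possible issues listed earlier, something like the sketch below could work. The matching strings, the 60 second timeout, and the function names are all assumptions for illustration; they are not messages the script is guaranteed to print, so treat the result as a hint only.

```python
import subprocess


def classify_run(stdout, stderr, timed_out=False):
    """Map the captured output of a manual run onto a rough failure category."""
    if timed_out:
        return "agent timed out (#4)"
    err = stderr.lower()
    # Typical Python-side failures: script path wrong or a module is missing.
    if "can't open file" in err or "modulenotfounderror" in err:
        return "local Python problem on the agent"
    if not stdout.strip():
        return "nothing returned (#2)"
    # Verbose runs list CIM elements; without them we only have status or an error.
    if "Element Name" not in stdout:
        return "no CIM elements in output (#1 or #3)"
    return "CIM data returned"


def run_and_classify(cmd, timeout=60):
    """Run the command (an argument list) and classify what came back."""
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return classify_run("", "", timed_out=True)
    except FileNotFoundError:
        # Raised when the Python executable itself is not at the given path.
        return "local Python problem on the agent"
    return classify_run(done.stdout, done.stderr)
```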


Post what you find.

Re: Not able to view CIM data, but alerts work

Post by jlester »

Awesome, thanks for all the helpful info! I poked around some more and found a bit out:

I ran the python script manually, and it looks like it spewed out all the CIM data we'd expect. It also ran super quickly - took just a few seconds:

Code: Select all

20220906 11:50:41 LCD Status: True
20220906 11:50:41 Chassis Intrusion Status: True
20220906 11:50:41 Connection to https://10.10.10.2
20220906 11:50:41 Found pywbem version 1.4.1
20220906 11:50:41 Check classe OMC_SMASHFirmwareIdentity
20220906 11:50:41   Element Name = System BIOS
20220906 11:50:41     VersionString = 1.5.4
20220906 11:50:41 Check classe CIM_Chassis
20220906 11:50:41   Element Name = Chassis
20220906 11:50:41     Manufacturer = Dell Inc.
20220906 11:50:41     SerialNumber = xxxxxxx
20220906 11:50:41     Model = PowerEdge T430
20220906 11:50:41     Element Op Status = 0
20220906 11:50:41 Check classe CIM_Card
20220906 11:50:42   Element Name = unknown
20220906 11:50:42     Element Op Status = 0
20220906 11:50:42 Check classe CIM_ComputerSystem
20220906 11:50:42   Element Name = System Board 7:1
20220906 11:50:42     Element Op Status = 0
20220906 11:50:42   Element Name = Add-in Card 11:1
20220906 11:50:42     Element Op Status = 0
20220906 11:50:42   Element Name = Add-in Card 11:2
20220906 11:50:42     Element Op Status = 0
20220906 11:50:42   Element Name = Add-in Card 11:3
20220906 11:50:42     Element Op Status = 0
20220906 11:50:42   Element Name = NV-PE430-01
20220906 11:50:42   Element Name = Hardware Management Controller (Node 0)
20220906 11:50:42     Element Op Status = 0
20220906 11:50:42 Check classe CIM_NumericSensor
20220906 11:50:42   Element Name = System Board 1 SYS Usage
20220906 11:50:42     sensorType = 1 - Other
20220906 11:50:42     BaseUnits = 65
20220906 11:50:42     Scaled by = 0.010000
20220906 11:50:42     Current Reading = 5.000000
20220906 11:50:42     Upper Threshold Non Critical = 101.000000
20220906 11:50:42     Element Op Status = 2
20220906 11:50:42   Element Name = System Board 1 MEM Usage
20220906 11:50:42     sensorType = 1 - Other
20220906 11:50:42     BaseUnits = 65
20220906 11:50:42     Scaled by = 0.010000
20220906 11:50:42     Current Reading = 0.000000
20220906 11:50:42     Upper Threshold Non Critical = 101.000000
20220906 11:50:42     Element Op Status = 2
20220906 11:50:42   Element Name = System Board 1 IO Usage
20220906 11:50:42     sensorType = 1 - Other
20220906 11:50:42     BaseUnits = 65
20220906 11:50:42     Scaled by = 0.010000
20220906 11:50:42     Current Reading = 0.000000
20220906 11:50:42     Upper Threshold Non Critical = 101.000000
20220906 11:50:42     Element Op Status = 2
20220906 11:50:42   Element Name = System Board 1 CPU Usage
20220906 11:50:42     sensorType = 1 - Other
20220906 11:50:42     BaseUnits = 65
20220906 11:50:42     Scaled by = 0.010000
20220906 11:50:42     Current Reading = 6.000000
20220906 11:50:42     Upper Threshold Non Critical = 101.000000
20220906 11:50:42     Element Op Status = 2
20220906 11:50:42   Element Name = Processor 1 Temp
20220906 11:50:42     sensorType = 2 - Temperature
20220906 11:50:42     BaseUnits = 2
20220906 11:50:42     Scaled by = 0.010000
20220906 11:50:42     Current Reading = 52.000000
20220906 11:50:42     Element Op Status = 2
20220906 11:50:42   Element Name = System Board 1 Pwr Consumption
20220906 11:50:42     sensorType = 4 - Current
20220906 11:50:42     BaseUnits = 7
20220906 11:50:42     Scaled by = 0.010000
20220906 11:50:42     Current Reading = 0.000000
20220906 11:50:42     Upper Threshold Non Critical = 1092.000000
20220906 11:50:42     Upper Threshold Critical = 1204.000000
20220906 11:50:42     Element Op Status = 2
20220906 11:50:42   Element Name = System Board 1 Fan1
20220906 11:50:42     sensorType = 5 - Tachometer
20220906 11:50:42     BaseUnits = 19
20220906 11:50:42     Scaled by = 0.010000
20220906 11:50:42     Current Reading = 1200.000000
20220906 11:50:42     Lower Threshold Non Critical = 360.000000
20220906 11:50:42     Lower Threshold Critical = 240.000000
20220906 11:50:42     Element Op Status = 2
20220906 11:50:42   Element Name = System Board 1 Inlet Temp
20220906 11:50:42     sensorType = 2 - Temperature
20220906 11:50:42     BaseUnits = 2
20220906 11:50:42     Scaled by = 0.010000
20220906 11:50:42     Current Reading = 24.000000
20220906 11:50:42     Lower Threshold Non Critical = 3.000000
20220906 11:50:42     Upper Threshold Non Critical = 42.000000
20220906 11:50:42     Lower Threshold Critical = -7.000000
20220906 11:50:42     Upper Threshold Critical = 47.000000
20220906 11:50:42     Element Op Status = 2
20220906 11:50:42 Check classe CIM_Memory
20220906 11:50:42   Element Name = CPU1 Level-1 Cache
20220906 11:50:42     Element Op Status = 0
20220906 11:50:42   Element Name = CPU1 Level-2 Cache
20220906 11:50:42     Element Op Status = 0
20220906 11:50:42   Element Name = CPU1 Level-3 Cache
20220906 11:50:42     Element Op Status = 0
20220906 11:50:42   Element Name = Memory
20220906 11:50:42     Element Op Status = 2
20220906 11:50:42 Check classe CIM_Processor
20220906 11:50:42   Element Name = CPU1
20220906 11:50:42     Family = 179
20220906 11:50:42     CurrentClockSpeed = 2400MHz
20220906 11:50:42     Element Op Status = 2
20220906 11:50:42 Check classe CIM_RecordLog
20220906 11:50:43   Element Name = IPMI SEL
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43 Check classe OMC_DiscreteSensor
20220906 11:50:43   Element Name = Memory Device 1 B  4: Uncorrectable ECC
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 B  4: Correctable ECC logging limit reached
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 B  4: Presence Detected
20220906 11:50:43   Element Name = Memory Device 1 B  3: Uncorrectable ECC
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 B  3: Correctable ECC logging limit reached
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 B  3: Presence Detected
20220906 11:50:43   Element Name = Memory Device 1 B  2: Uncorrectable ECC
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 B  2: Correctable ECC logging limit reached
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 B  2: Presence Detected
20220906 11:50:43   Element Name = Memory Device 1 B  1: Uncorrectable ECC
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 B  1: Correctable ECC logging limit reached
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 B  1: Presence Detected
20220906 11:50:43   Element Name = Unspecified 1 vFlash 0
20220906 11:50:43   Element Name = Add-in Card 3 SD2 0: Unknown
20220906 11:50:43   Element Name = Add-in Card 3 SD2 0: Unknown
20220906 11:50:43   Element Name = Add-in Card 3 SD2 0: Unknown
20220906 11:50:43   Element Name = Add-in Card 3 SD2 0: Unknown
20220906 11:50:43   Element Name = Add-in Card 3 SD1 0: Unknown
20220906 11:50:43   Element Name = Add-in Card 3 SD1 0: Unknown
20220906 11:50:43   Element Name = Add-in Card 3 SD1 0: Unknown
20220906 11:50:43   Element Name = Add-in Card 3 SD1 0: Unknown
20220906 11:50:43   Element Name = Memory Device 1 A  8: Uncorrectable ECC
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  8: Correctable ECC logging limit reached
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  8: Presence Detected
20220906 11:50:43   Element Name = Memory Device 1 A  7: Uncorrectable ECC
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  7: Correctable ECC logging limit reached
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  7: Presence Detected
20220906 11:50:43   Element Name = Memory Device 1 A  6: Uncorrectable ECC
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  6: Correctable ECC logging limit reached
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  6: Presence Detected
20220906 11:50:43   Element Name = Memory Device 1 A  5: Uncorrectable ECC
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  5: Correctable ECC logging limit reached
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  5: Presence Detected
20220906 11:50:43   Element Name = Memory Device 1 A  4: Uncorrectable ECC
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  4: Correctable ECC logging limit reached
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  4: Presence Detected
20220906 11:50:43   Element Name = Memory Device 1 A  3: Uncorrectable ECC
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  3: Correctable ECC logging limit reached
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  3: Presence Detected
20220906 11:50:43   Element Name = Memory Device 1 A  2: Uncorrectable ECC
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  2: Correctable ECC logging limit reached
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  2: Presence Detected
20220906 11:50:43   Element Name = Memory Device 1 A  1: Uncorrectable ECC
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  1: Correctable ECC logging limit reached
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Memory Device 1 A  1: Presence Detected
20220906 11:50:43   Element Name = System Board 1 PCIe Slot3 0: NMI/Diag Interrupt
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 PCIe Slot3 0: Bus Timeout
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 PCIe Slot3 0: I/O Channel check NMI
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 PCIe Slot3 0: Software NMI
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 PCIe Slot3 0: PCI PERR
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 PCIe Slot3 0: PCI SERR
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 PCIe Slot3 0: EISA failsafe timeout
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 PCIe Slot3 0: Bus Correctable error
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 PCIe Slot3 0: Bus Uncorrectable error
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 OS Watchdog 0: Timer expired
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 OS Watchdog 0: Hard reset
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 OS Watchdog 0: Power down
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 OS Watchdog 0: Power cycle
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Add-in Card 3 Redundancy 0
20220906 11:50:43   Element Name = System Board 1 Intrusion 0: General Chassis intrusion
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Add-in Card 1 ROMB Battery 0: Low
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Add-in Card 1 ROMB Battery 0: Failed
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 CMOS Battery 0: Low
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 CMOS Battery 0: Failed
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 2 Status 0: IERR
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 2 Status 0: Thermal Trip
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 2 Status 0: Configuration Error
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 2 Status 0: Presence detected
20220906 11:50:43   Element Name = Processor 2 Status 0: Throttled
20220906 11:50:43   Element Name = Processor 2 Status 0: Unknown
20220906 11:50:43   Element Name = Processor 1 Status 0: IERR
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 1 Status 0: Thermal Trip
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 1 Status 0: Configuration Error
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 1 Status 0: Presence detected
20220906 11:50:43   Element Name = Processor 1 Status 0: Throttled
20220906 11:50:43   Element Name = Processor 1 Status 0: Unknown
20220906 11:50:43   Element Name = Disk Drive Bay 1 Presence 0: Present
20220906 11:50:43   Element Name = Disk Drive Bay 1 Presence 0: Absent
20220906 11:50:43   Element Name = System Board 1 VGA Cable Pres 0: Connected
20220906 11:50:43     Element Op Status = 12
20220906 11:50:43 Global exit set to WARNING
20220906 11:50:43   Element Name = System Board 1 VGA Cable Pres 0: Config Error
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 USB Cable Pres 0: Connected
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 USB Cable Pres 0: Config Error
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 LCD Cable Pres 0: Connected
20220906 11:50:43     Element Op Status = 12
20220906 11:50:43 Global exit set to WARNING
20220906 11:50:43   Element Name = System Board 1 LCD Cable Pres 0: Config Error
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 Dedicated NIC 0: Present
20220906 11:50:43   Element Name = System Board 1 Dedicated NIC 0: Absent
20220906 11:50:43   Element Name = Add-in Card 3 Presence 0: Present
20220906 11:50:43   Element Name = Add-in Card 3 Presence 0: Absent
20220906 11:50:43   Element Name = Processor 2 Presence 0: Present
20220906 11:50:43   Element Name = Processor 2 Presence 0: Absent
20220906 11:50:43   Element Name = Processor 1 Presence 0: Present
20220906 11:50:43   Element Name = Processor 1 Presence 0: Absent
20220906 11:50:43   Element Name = System Board 1 DIMM PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 PS2 PG Fail 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 PS1 PG Fail 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 1.5V PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 1 FIVR PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 1.5V AUX PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 2.5V AUX PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 1.05V PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 1 M23 VPP PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 1 VCORE PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 1 5V SWITCH PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 1 M01 VPP PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 1 M23 VDDQ PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 5V AUX PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 VCCIO PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = System Board 1 3.3V PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 1 M23 VDDQ PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 1 M23 VTT PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43   Element Name = Processor 1 M01 VTT PG 0
20220906 11:50:43     Element Op Status = 2
20220906 11:50:43 Check classe OMC_Fan
20220906 11:50:44   Element Name = Fan1
20220906 11:50:44     Element Op Status = 2
20220906 11:50:44 Check classe OMC_PowerSupply
20220906 11:50:44   Element Name = Power Supply 1
20220906 11:50:44     Element Op Status = 2
20220906 11:50:44   Element Name = Power Supply 2
20220906 11:50:44     Element Op Status = 2
20220906 11:50:44 Check classe VMware_StorageExtent
20220906 11:50:44 Check classe VMware_Controller
20220906 11:50:44 Check classe VMware_StorageVolume
20220906 11:50:44 Check classe VMware_Battery
20220906 11:50:44 Check classe VMware_SASSATAPort
 WARNING : System Board 1 VGA Cable Pres 0: Connected  WARNING : System Board 1 LCD Cable Pres 0: Connected  - Server:  Dell Inc. PowerEdge T430 s/n: xxxxxxx System BIOS: 1.5.4 2015-10-05

I also did a 'rescan host' from the ESX main screen. Looking at the logs, the initial status report and the CIM data seem to be returned as well. Here are the three 'info' messages from the script:

Code: Select all

PROBE Ready -> [ 35579 ]

Code: Select all

PROBE -> [ OK - Server: Dell Inc. PowerEdge T430 s/n: xxxxxxx System BIOS: 1.5.4 2015-10-05 ]

Code: Select all

CIM SQL -> [ ( 3 , 3 , 846 , '10.10.10.2' , ' Chassis', '0')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' unknown', '0')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 7:1', '0')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Add-in Card 11:1', '0')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Add-in Card 11:2', '0')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Add-in Card 11:3', '0')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Hardware Management Controller (Node 0)', '0')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 SYS Usage', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 MEM Usage', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 IO Usage', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 CPU Usage', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 1 Temp', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 Pwr Consumption', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 Fan1', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 Inlet Temp', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' CPU1 Level-1 Cache', '0')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' CPU1 Level-2 Cache', '0')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' CPU1 Level-3 Cache', '0')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' CPU1', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' IPMI SEL', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 B  4: Uncorrectable ECC', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 B  4: Correctable ECC logging limit reached', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 B  3: Uncorrectable ECC', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 B  3: Correctable ECC logging limit reached', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 B  2: Uncorrectable ECC', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 B  2: Correctable ECC logging limit reached', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 B  1: Uncorrectable ECC', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 B  1: Correctable ECC logging limit reached', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  8: Uncorrectable ECC', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  8: Correctable ECC logging limit reached', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  7: Uncorrectable ECC', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  7: Correctable ECC logging limit reached', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  6: Uncorrectable ECC', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  6: Correctable ECC logging limit reached', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  5: Uncorrectable ECC', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  5: Correctable ECC logging limit reached', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  4: Uncorrectable ECC', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  4: Correctable ECC logging limit reached', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  3: Uncorrectable ECC', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  3: Correctable ECC logging limit reached', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  2: Uncorrectable ECC', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  2: Correctable ECC logging limit reached', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  1: Uncorrectable ECC', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Memory Device 1 A  1: Correctable ECC logging limit reached', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 PCIe Slot3 0: NMI/Diag Interrupt', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 PCIe Slot3 0: Bus Timeout', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 PCIe Slot3 0: I/O Channel check NMI', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 PCIe Slot3 0: Software NMI', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 PCIe Slot3 0: PCI PERR', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 PCIe Slot3 0: PCI SERR', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 PCIe Slot3 0: EISA failsafe timeout', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 PCIe Slot3 0: Bus Correctable error', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 PCIe Slot3 0: Bus Uncorrectable error', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 OS Watchdog 0: Timer expired', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 OS Watchdog 0: Hard reset', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 OS Watchdog 0: Power down', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 OS Watchdog 0: Power cycle', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 Intrusion 0: General Chassis intrusion', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Add-in Card 1 ROMB Battery 0: Low', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Add-in Card 1 ROMB Battery 0: Failed', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 CMOS Battery 0: Low', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 CMOS Battery 0: Failed', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 2 Status 0: IERR', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 2 Status 0: Thermal Trip', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 2 Status 0: Configuration Error', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 1 Status 0: IERR', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 1 Status 0: Thermal Trip', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 1 Status 0: Configuration Error', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 VGA Cable Pres 0: Config Error', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 USB Cable Pres 0: Connected', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 USB Cable Pres 0: Config Error', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 LCD Cable Pres 0: Config Error', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 DIMM PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 PS2 PG Fail 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 PS1 PG Fail 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 1.5V PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 1 FIVR PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 1.5V AUX PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 2.5V AUX PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 1.05V PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 1 M23 VPP PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 1 VCORE PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 1 5V SWITCH PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 1 M01 VPP PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 1 M23 VDDQ PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 5V AUX PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 VCCIO PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' System Board 1 3.3V PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 1 M23 VDDQ PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 1 M23 VTT PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Processor 1 M01 VTT PG 0', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Fan1', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Power Supply 1', '2')
,( 3 , 3 , 846 ,  '10.10.10.2' , ' Power Supply 2', '2')
  ]

If I query the database manually:

Code: Select all

select * from plugin_p4l_vmware_healthmon_cimdata where ProbeID = 846

It also looks like all the CIM data is there. I did notice there were two "ESXHost" values for that one ProbeID - one for 10.10.10.2 (which we'd expect) and another from an old host. Out of pure reckless abandon, I deleted everything out of the _cimdata table where ProbeID=846. Rescanned, confirmed the _cimdata table was populated with new info... but the CIM window is still blank in the plugin :(
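For anyone wanting to spot leftover hosts like this without deleting first, a grouped count per ESXHost shows it quickly. The sketch below is only a stand-in: it uses an in-memory SQLite table instead of the real Automate MySQL database, keeps only the three column names that appear in queries in this thread, and fills it with made-up rows (including a made-up address for the old host).

```python
import sqlite3

TABLE = "plugin_p4l_vmware_healthmon_cimdata"


def demo_connection():
    """Build an in-memory stand-in table with made-up rows for ProbeID 846."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {TABLE} (ProbeID INTEGER, ESXHost TEXT, CIM_Value TEXT)")
    conn.executemany(
        f"INSERT INTO {TABLE} VALUES (?, ?, ?)",
        [
            (846, "10.10.10.2", "0"),
            (846, "10.10.10.2", "2"),
            (846, "10.10.10.9", "0"),  # made-up address standing in for the old host
        ],
    )
    return conn


def hosts_for_probe(conn, probe_id):
    """Return (ESXHost, row count) pairs stored for one ProbeID."""
    cur = conn.execute(
        f"SELECT ESXHost, COUNT(*) FROM {TABLE} WHERE ProbeID = ? "
        "GROUP BY ESXHost ORDER BY ESXHost",
        (probe_id,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    for host, count in hosts_for_probe(demo_connection(), 846):
        print(host, count)
```

The same SELECT with GROUP BY should be usable as-is against the real table from a SQL client, with the probe ID filled in directly.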

As a last-ditch effort, I ran our Control Center reset script, which basically deletes some registry values, as well as the "c:\programdata\Labtech Plugins" and "C:\ProgramData\Labtech Client\Cache" folder. But alas, still no data in the CIM menu.

Anything else I can try to help narrow down this issue?

thanks again Cubert!!

Re: Not able to view CIM data, but alerts work

Post by Cubert »

Awesome post!

OK, so... we have valid CIM data but no display in the CIM window.
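One way to double-check the manual output against what was stored is to reduce the verbose log to (element name, status) pairs. This is a minimal sketch that assumes the line layout seen in the output posted above; the function name is made up, and it is not how the plugin itself parses the data.

```python
import re

ELEMENT_RE = re.compile(r"Element Name = (.*)$")
STATUS_RE = re.compile(r"Element Op Status = (\d+)\s*$")


def element_statuses(verbose_output):
    """Pair each 'Element Name' with the 'Element Op Status' that follows it."""
    pairs = []
    current = None
    for line in verbose_output.splitlines():
        match = ELEMENT_RE.search(line)
        if match:
            # A new element starts; one without a status line is simply dropped.
            current = match.group(1).strip()
            continue
        match = STATUS_RE.search(line)
        if match and current is not None:
            pairs.append((current, int(match.group(1))))
            current = None
    return pairs


SAMPLE = """\
20220906 11:50:41   Element Name = Chassis
20220906 11:50:41     Manufacturer = Dell Inc.
20220906 11:50:41     Element Op Status = 0
20220906 11:50:42   Element Name = NV-PE430-01
20220906 11:50:42   Element Name = Hardware Management Controller (Node 0)
20220906 11:50:42     Element Op Status = 0
"""

if __name__ == "__main__":
    for name, status in element_statuses(SAMPLE):
        print(status, name)
```

In the logs posted above, the two entries with status 12 do not appear among the CIM SQL rows, so an exact one-to-one match with the stored rows should not be expected.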

Is this the same with all CIM data windows? All agents doing this or just this agent/host?

For it to be an actual plugin display issue, I would expect it to happen for all agents.

Re: Not able to view CIM data, but alerts work

Post by Cubert »

Also, in your Control Center config you can set it to launch a SQL-SPY window that shows you all the SQL queries that take place while you navigate any tool or window in the Control Center.

You can monitor this window as you open the CIM windows to see them query the VMware DB tables for the CIM data.

SQLSpy will allow you to freeze the query log window so you can copy and paste the query that ran into a SQLyog window to test what the query returns.

This will help identify what was requested and what was returned to CIM window.


Turn it on in your Dashboard configs.

[Attachment: Screenshot 2022-09-07 085500.png]

It auto-launches when you open the Control Center.

[Attachment: Screenshot 2022-09-07 090023.png]

Re: Not able to view CIM data, but alerts work

Post by jlester »

Awesome, thanks for the great info! You are an absolute wizard. I'm afraid it's looking more like the plugin just isn't displaying the data tho. I get something like the following with SQL-SPY:

Code: Select all

--> EDTSELECT * FROM plugin_p4l_vmware_healthmon_cimdata a LEFT JOIN plugin_p4l_vmware_healthmon_types t on a.CIM_Value = t.DataValue WHERE a.ProbeID = '449' and a.ESXHost = '192.168.1.9' and t.DataType = 'unknown' 1777.385

And when I run that SQL manually, it does return all the CIM data. And yes, it does seem to be affecting every agent that we've set up as a probe - we only have four total.

Thanks again!!

Re: Not able to view CIM data, but alerts work

Post by Cubert »

Is it the same from both the global view and the client-level view? Both should give you a CIM view. Are both views blank?

Re: Not able to view CIM data, but alerts work

Post by Cubert »

I have reviewed the code and it should be pretty straightforward. You are making the query and data is returned, so unless a field is malformed (a string where an integer is expected), everything should show up.


So with that in mind, I have built a new 5.0.0.5-pre version that makes a few minor changes to the CIM data viewer.

As we loop through the data rows to build out the view, if any row has an issue you will get a pop-up message box with an error message. That's the only difference between the current build and this build, but it will show us if you have some malformed data.
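To illustrate the idea of reporting per-row problems instead of silently showing nothing, here is a minimal sketch. It is not the plugin's code: rows are plain dicts, the keys `name` and `status` are made up, and errors are collected in a list rather than shown in a message box.

```python
def build_view(rows):
    """Convert rows to (name, status) tuples, recording a message for each bad row."""
    view, errors = [], []
    for index, row in enumerate(rows):
        try:
            # int() fails on a string where an integer is expected;
            # a missing key raises KeyError.
            view.append((row["name"], int(row["status"])))
        except (KeyError, ValueError, TypeError) as exc:
            errors.append(f"row {index}: {type(exc).__name__}: {exc}")
    return view, errors


if __name__ == "__main__":
    sample = [
        {"name": "Chassis", "status": "0"},
        {"name": "Fan1", "status": "two"},  # malformed value
        {"name": "Memory"},                 # missing field
    ]
    view, errors = build_view(sample)
    print(view)
    for message in errors:
        print(message)
```

With a loop like this, one bad row produces one message, so a problem that affects every row shows up as one message per row.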

Here is what good CIM data should look like from SQL Query.

[Attachment: Screenshot 2022-09-13 090803.png]

Here is Build 5.0.0.5-prerelease with error messaging.



Upgrade your plugin to this build and retry viewing CIM data. Make sure to close and relaunch the Control Center after the upgrade.

Re: Not able to view CIM data, but alerts work

Post by jlester »

Got it! We get the following:

"Column 'Ignore_Alarm' does not belong to table Table."

That message pops up probably a couple hundred times whenever we open a CIM data window. It also appears to pop up for every ESX host monitor we have. Anything I might be able to try to get around this particular hangup?

thanks!!

Re: Not able to view CIM data, but alerts work

Post by Cubert »

Ah ha!

You have a version 4 table under version 5 of the plugin.

So if you have SQL access, run the following in SQLyog:

Code: Select all

DROP TABLE IF EXISTS plugin_p4l_vmware_healthmon_cimdata

Afterwards, restart the Automate DBAgent to recreate the table as it should be. You will know you're successful when the table has that column in it.

If you're hosted, then you may need to have support run the command for you.
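To confirm the column is there after the table is recreated, a check along these lines can help. The sketch uses an in-memory SQLite table as a stand-in, with made-up column sets for the old and new layouts; on the real MySQL server the equivalent check would be `SHOW COLUMNS FROM plugin_p4l_vmware_healthmon_cimdata LIKE 'Ignore_Alarm'`.

```python
import sqlite3

TABLE = "plugin_p4l_vmware_healthmon_cimdata"


def has_column(conn, table, column):
    """True if `table` has a column named `column` (uses SQLite's PRAGMA table_info)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); index 1 is the name.
    return any(row[1] == column for row in rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Stand-in for the old layout: no Ignore_Alarm column (other columns are made up).
    conn.execute(f"CREATE TABLE {TABLE} (ProbeID INTEGER, ESXHost TEXT)")
    print("before:", has_column(conn, TABLE, "Ignore_Alarm"))

    # Drop and recreate, standing in for what the DBAgent restart is described as doing.
    conn.execute(f"DROP TABLE IF EXISTS {TABLE}")
    conn.execute(f"CREATE TABLE {TABLE} (ProbeID INTEGER, ESXHost TEXT, Ignore_Alarm INTEGER)")
    print("after:", has_column(conn, TABLE, "Ignore_Alarm"))
```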
