How-To Manually Test Probes
Re: How-To Manually Test Probes
I've removed and reinstalled the Python script, but I get the same result.
I also ran the script manually again on all 4 servers, and all of them return the correct results. However, 2 of the 4 servers are not showing correctly in the VMWare ESX Health Monitor window.
ESX05 and ESXi101 are showing the correct status, but ESXi102/201 both have battery issues and still show okay.
The script is showing the correct value, though, so maybe we need to clear out the data table like we did for ESXi101, since that one does show the correct status?
Re: How-To Manually Test Probes
Hi Cubert,
Any idea or update on a solution?
Thanks.
J.
Re: How-To Manually Test Probes
Sorry J,
I was waiting for the results of this:
"Though the script is showing the correct value so maybe we need to clear out the data table like we did for ESXi101 as that one does show the correct status?"
Did you do this? I am assuming either no, or it failed to help?
Re: How-To Manually Test Probes
Hi Cubert,
Oh sorry, I missed that and did not do that for the other devices.
Let me remove the data from the data table for those other devices and I'll let you know if that fixed it.
J.
Re: How-To Manually Test Probes
I've deleted the data from the CIM data table for the 4 devices in question. 2 of the 4 are reporting okay, but the other two are not.
This is me running the script manually, and as you can see, 3 out of the 4 are returning an error. When I check the plugin_p4l_vmware_healthmon_hosts table, it shows 3 out of the 4 have no issues, which does not line up with the script that I ran manually. The CIM data status for the 4 devices does match the status shown in the screenshot above, which matches the status in the client VMWare ESX Health Monitor screen in Automate.
Why is it not ingesting the failed status into the hosts table and cimdata table?
I might be able to get you the output details from when the Automate script runs, if that would help, though I'm not sure that gets us all the details.
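To make this kind of mismatch easier to pin down, the manual probe output and the stored status can be compared host by host. The following is only a sketch: `status_word` and `find_mismatches` are hypothetical helper names, and it assumes the manual output lines and the `Status` values from plugin_p4l_vmware_healthmon_hosts have already been copied into plain dictionaries keyed by host. The status strings in the usage note are made up for illustration.

```python
STATUS_WORDS = ("OK", "WARNING", "CRITICAL", "UNKNOWN")


def status_word(text):
    """Return the leading status keyword of a probe line, or UNKNOWN if none is found."""
    stripped = text.strip().upper()
    for word in STATUS_WORDS:
        if stripped.startswith(word):
            return word
    return "UNKNOWN"


def find_mismatches(manual, stored):
    """List (host, manual_status, stored_status) for hosts where the two disagree.

    manual: host -> output line from running the probe by hand
    stored: host -> Status value read from the hosts table
    """
    mismatches = []
    for host in sorted(manual):
        manual_status = status_word(manual[host])
        # A host missing from the table is treated as UNKNOWN.
        stored_status = status_word(stored.get(host, ""))
        if manual_status != stored_status:
            mismatches.append((host, manual_status, stored_status))
    return mismatches
```

For example, calling `find_mismatches({"ESXi102": "WARNING : battery"}, {"ESXi102": "OK - no issues"})` would flag ESXi102 as WARNING by hand but OK in the table.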
Re: How-To Manually Test Probes
interesting!!!
OK, let's do some script logging. For some reason we are either not getting a successful status, or SQL is not updating as expected in the script.
Just to give you a bit of the scripting logic we are using:
Name of script: P4A ESXi Health Monitor Service
Lines 1 - 29 deal with testing for and installing python3 and any supporting scripts and extensions on the probing agent.
Lines 30 and 31 test for the Python script and make sure it's readable as a script.
Line 33: we query for all ESX hosts that are assigned to the probing agent.
Line 34: we start to loop through the query of ESX hosts.
We use several lines to validate the ESX data we have before executing the first of two probes against a single ESX host.
** On Line 47 we execute the first probe command, which is not in verbose mode, so we should get just a simple return.
Code: Select all
C:\Python3\Python.exe -W ignore C:\Python3\check_esxi_hardware.py -H @ESXHOSTIP@ -U @ESXUsername@ -P "@ESXPassword@" -V @ESXVender@
This is the command that is returning what you're seeing in the last post.
Line 48: we log this output to the agent's script log (PROBE -> [ @PROBEDATA@ ])
This is a good point to verify that the data is accurate in the script log, as the SQL insert is the next step.
Line 49: we create a script variable that holds the SQL query value.
Code: Select all
UPDATE plugin_p4l_vmware_healthmon_hosts SET LastScan = NOW(), Status = '@PROBEDATA@' WHERE ESXHost = '@ESXHOSTIP@' and ProbeID = %computerid%
Line 50: we execute the SQL query, which at this point should be updating this view's top status portion (above the CIM data).
Line 51: we execute the same command again but with verbose turned on, which spits out all our CIM data. We then save that to SQL on line 54.
Code: Select all
C:\Python3\Python.exe -W ignore C:\Python3\check_esxi_hardware.py -v -H @ESXHOSTIP@ -U @ESXUsername@ -P "@ESXPassword@" -V @ESXVender@
Line 54
Code: Select all
INSERT IGNORE INTO `plugin_p4l_vmware_healthmon_cimdata` (`ClientID`,`LocationID`,`ProbeID`,`ESXHost`,`CIM_Item`,`CIM_Value`) VALUES @CIMSQLDATA@ ON DUPLICATE KEY UPDATE CIM_Value=VALUES(CIM_Value)
So the first thing to test is: did we get the correct data printed to the agent's script log for line 48? That will show us what it will be trying to input into SQL.
Then, did SQL take that data or did it fail on line 50?
If you add a new script line before line 50 that prints out to the script log (like line 48):
Code: Select all
Here is SQL query -> [ @SCANSQL@ ]
This will print out the actual SQL query with all data included in the query. Copy and paste that query into your SQL. Does SQL fail, and if so, with what return?
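If it helps to preview that query outside of Automate, the substitution can be imitated by hand. This is a rough sketch only: `expand_placeholders`, `build_scan_sql` and `has_single_quote` are hypothetical helpers, the `@NAME@` replacement merely mimics what the script engine appears to do with its variables, and how Automate itself escapes values is not something this example knows. The quote check is just one thing that might be worth ruling out, since a single quote inside the probe output would end the `Status = '...'` string literal early.

```python
import re

# Query template as used on line 49 of the script.
SCAN_SQL_TEMPLATE = (
    "UPDATE plugin_p4l_vmware_healthmon_hosts "
    "SET LastScan = NOW(), Status = '@PROBEDATA@' "
    "WHERE ESXHost = '@ESXHOSTIP@' and ProbeID = %computerid%"
)


def expand_placeholders(template, values):
    """Replace each @NAME@ token with values[NAME]; unknown tokens are left as-is."""
    def substitute(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))
    return re.sub(r"@(\w+)@", substitute, template)


def build_scan_sql(probe_data, host_ip, computer_id):
    """Build the status UPDATE the way it would look after variable substitution."""
    sql = expand_placeholders(
        SCAN_SQL_TEMPLATE,
        {"PROBEDATA": probe_data, "ESXHOSTIP": host_ip},
    )
    return sql.replace("%computerid%", str(computer_id))


def has_single_quote(probe_data):
    """True if the probe output contains a quote that could break the SQL string."""
    return "'" in probe_data
```

Calling `build_scan_sql(probe_output, host_ip, probe_id)` with the values from the script log gives a query that can be pasted into a SQL client and compared with what the extra log line prints.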
Re: How-To Manually Test Probes
I added the extra line into the script and when I tested it, the results match what the script is producing so it's not an issue with ingesting the data, at least not that I have been able to detect.
Now I have a different issue
The Python script isn't detecting the correct state anymore. I did try to restart the VMware management agent services but that didn't make a difference.
When I run the exact same cmd on the same servers that still have the hardware issue as before, it now returns status okay for 3 out of the 4 servers!
In the vSphere client it does show the correct status, so now I'm confused about why this is happening.
I did double check that we use the latest version of the script and that's indeed the case.
Here are the snippets for each of the stages, and it looks like the issue is with the script.
Is there anything else we can do other than maybe running a debug on the Python script or restarting the servers?
One other thing that I noticed is that the schedule of checking servers once per hour is no longer happening.
Is there a way to reset the schedule?
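For repeating the manual runs described above in a consistent way, a small wrapper could build the same command line and record both the exit code and the text for each host. This is a sketch under assumptions: `build_probe_command` and `run_probe` are hypothetical helpers, the paths are the ones from the script, and the exit-code mapping assumes check_esxi_hardware.py follows the usual monitoring-plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import subprocess

# Conventional monitoring-plugin exit codes (assumed to apply to this script).
EXIT_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def build_probe_command(host, user, password, vendor, verbose=False):
    """Build the probe command as an argument list, optionally with -v for CIM detail."""
    command = [
        r"C:\Python3\Python.exe", "-W", "ignore",
        r"C:\Python3\check_esxi_hardware.py",
    ]
    if verbose:
        command.append("-v")
    command += ["-H", host, "-U", user, "-P", password, "-V", vendor]
    return command


def run_probe(command, timeout=120):
    """Run a probe command and return (state, first-stream output text)."""
    completed = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout
    )
    # Any exit code outside the known set is reported as UNKNOWN.
    state = EXIT_STATES.get(completed.returncode, "UNKNOWN")
    return state, completed.stdout.strip()
```

Running this a few times per host and keeping the verbose output next to the short status would show whether the CIM data itself changes between runs or only the summary line does.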
Re: How-To Manually Test Probes
The first issue looks to be something with the check_esxi_hardware.py script file. Server shows one thing and ESX CIM shows another.
That's an interesting issue.
We use the health reader Python script from the guys over at Infiniroot.com. https://www.claudiokuenzler.com/monitor ... rdware.php
I will need to review the docs and such to see why a false reading would come across.
As for the automation stopping, try restarting the DBagent on Automate. That should reset all the backend automations taking place.
Re: How-To Manually Test Probes
How long do you think you need to get the script issue sorted out?
Just so I know if I should put another solution in place until this is sorted out.
The restart of the database agent fixed the scheduling issue, thanks for that.