Allow me to preface this by saying that I absolutely love this plugin. However, we are receiving some error messages when adding hosts to the plugin that I'd appreciate your assistance with.
Error #1: UNKNOWN: Authentication Error
I have a few servers which are reporting this error. I've tried to perform due diligence and rule out user error. In each case I've logged on to a local system on the network, connected to the ESX host with vSphere and verified the credentials entered in the plugin are valid and allow login. Anything I'm missing here?
Error #2: Plugin never scans host
I've got 1 or 2 of these as well. I've entered the information into the plugin but the host is never scanned. There are no failure messages for these. Again, I've logged onto a system on the local network and connected to the ESX host via vSphere to make sure that I have the IP and credentials correct.
Error #3: OK
I have 2 of these, where the host appears to have been scanned, but the only response that I receive is "OK". Any more verbose logging or error syntax here?
Error #4: ConnectionError: Socket error: [Errno 10061] No connection could be made because the target machine actively refused it
I have several of these as well. Again, in all cases I've connected to the local network and then logged in to the ESX host with vSphere. My first guess is that there are services or daemons that aren't running, but I'm not sure which the plugin is requiring. I'm experiencing this on ESX hosts with and without SSH enabled. What can I do to correct this?
Error Messages and how to resolve
Re: Error Messages and how to resolve
Looks like you have several possible issues. If what I say here does not shed some light on your issues then email us at Helpdesk@plugins4labtech.com to open a ticket with us so we can investigate further.
The latest PYWBEM builds for Python (v10) fixed issues with connecting to newer SSL services such as ESX 6.5, but they also came with a flaw: a probe can now query only one server per script run. The reason is not fully clear, but it appears to be a cache that the LT script holds during execution at the agent. This cache lets the first SSL connection succeed, but the second typically fails with "No connection could be made because the target machine actively refused it", which looks like #4 on your list. In some instances we also see this behavior behind the authentication failures on your list (#1).
The fix is to have one probe per ESX host at the location, so two probes for two hosts. We hope the PYWBEM maintainers resolve this in a future release, since it worked in the older builds (v6).
We also see that ESX 6.5 ships with CIM turned off by default. You must first administratively enable it at the command line and then start its services. Just starting the services will look like it worked, but the backend services will fail to actually start.
See this post for the fix: http://www.squidworks.net/2017/02/vmwar ... y-default/
Lastly, the best tool for testing is the actual probe. To test that the probe works and can make its connections correctly, you can do three things.
#1 Run the probe's command manually to see the output yourself. Replace each @XXX@ placeholder with the real account data for the ESX host.
#2 Using Telnet, connect to port 5989 and see whether you get a rejection or a failure to open the port. If it is working correctly, you will connect and get a prompt in Telnet. If you do not, start looking at the CIM services and any firewalls running on ESX.
#3 If the Python install is missing files, make sure your AV software does not block zip file downloads from lp.plugins4labtech.com. If a proxy is being used (Barracuda), allow SSL and HTTP to lp.plugins4labtech.com. Then delete the python folder and allow the probe to run a scan. It will see that Python is missing and reinstall all parts of the probe. This typically solves probe install issues.
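The Telnet check in #2 can also be scripted. The following is a minimal sketch, not part of the plugin: `check_cim_port` is a hypothetical helper name, the 5-second timeout is an arbitrary choice, and it only tests that a TCP connection to port 5989 can be opened. It does not attempt an SSL handshake or a CIM login.

```python
import socket

def check_cim_port(host, port=5989, timeout=5.0):
    """Try a plain TCP connect to host:port and return (reachable, detail).

    A refused connection suggests the CIM service is not listening;
    a timeout suggests a firewall is silently dropping the traffic.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        return False, "timed out (possible firewall drop)"
    except socket.error as exc:
        return False, "connect failed: %s" % exc
    sock.close()
    return True, "port open"

# Example (192.0.2.10 is a placeholder address, substitute your ESX host):
# print(check_cim_port("192.0.2.10"))
```

Run it from a machine on the same network as the probe so the result reflects what the probe itself would see.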
Do not point the ESX Probe at a vCenter server. I do not believe that will work.
Let me know if any of this helps.
Probe command for step #1:
C:\Python27\Python.exe -W ignore C:\Python27\check_esxi_hardware.py -H @ESXHOSTIP@ -U @ESXUsername@ -P "@ESXPassword@" -V @ESXVender@
Re: Error Messages and how to resolve
As for #3 on your list: OK means just that, the ESX host is A-OK. Otherwise the status will be Warning (yellow) or FAILURE (red). Right-click the agent and select view CIM data to get the verbose view of the ESX host; any items in error will be marked with a colored dot.
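When running the probe script by hand, its exit code can be read the same way. This is only a sketch under a stated assumption: that check_esxi_hardware.py follows the standard Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The names `NAGIOS_STATES`, `THREAD_COLORS`, and `describe_exit_code` are hypothetical, and the colors are just the ones mentioned above, not a description of the plugin's internals.

```python
# Standard Nagios plugin exit codes (assumed to apply to the probe script).
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

# Colors as described in this thread; CRITICAL is what the thread calls FAILURE.
THREAD_COLORS = {"WARNING": "yellow", "CRITICAL": "red"}

def describe_exit_code(code):
    """Return (state, color) for an exit code; color is None if not stated."""
    state = NAGIOS_STATES.get(code, "UNKNOWN")
    return state, THREAD_COLORS.get(state)
```

On Windows, `echo %ERRORLEVEL%` immediately after a manual run shows the code to pass in.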
Re: Error Messages and how to resolve
Thank you for your replies. Sorry for my delayed response, but I just now had a chance to troubleshoot this further.
There were a few systems which had CIM Server set to start and stop with the host, but the service had stopped. Restarting the service via the CLI fixed those.
On the servers that were having authentication issues, I was able to manually run the script on each. In each case the script would run successfully. However, on looking closer, I noticed that each of the servers which returned an authentication error for the automated script had either an "@" or multiple "!" characters in the password. The password worked in the manual script because it was enclosed in quotes, but it doesn't look like the Automate script passes it through properly. In each case I changed the password to remove those characters, and now the Automate script works fine.
Thanks again for all your help!
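For manual testing with passwords like these, one way to take quoting out of the picture is to launch the probe from Python with an argument list, so the password reaches the script as a single argument with no placeholder substitution. This is a minimal sketch, not how the Automate script works: `build_probe_command` and `run_probe` are hypothetical helpers, and the paths and flags are copied from the probe command posted earlier in the thread.

```python
import subprocess

# Paths taken from the probe command posted earlier in this thread.
PYTHON_EXE = r"C:\Python27\Python.exe"
PROBE_SCRIPT = r"C:\Python27\check_esxi_hardware.py"

def build_probe_command(host, user, password, vendor):
    """Return the probe command as an argument list (no shell parsing)."""
    return [PYTHON_EXE, "-W", "ignore", PROBE_SCRIPT,
            "-H", host, "-U", user, "-P", password, "-V", vendor]

def run_probe(host, user, password, vendor):
    """Run the probe and return (exit_code, combined stdout/stderr)."""
    proc = subprocess.Popen(build_probe_command(host, user, password, vendor),
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    return proc.returncode, output
```

If a manual run like this succeeds with the original password while the automated run fails, that points at the substitution step rather than at the credentials.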
Re: Error Messages and how to resolve
Why is the icon not the green check mark?
The grey ellipses otherwise indicate an error, so having the same icon for some OK statuses keeps me from doing a quick visual scan to check the error state.
Also, it would be nice to have the plugin generate a plugin-related ticket when the status is not OK, so I wouldn't have to visually scan it to see whether the script failed.
Re: Error Messages and how to resolve
There should be an internal monitor that watches the status and CIM data status areas, and when either fails it should "do something". You have to set that something up in the monitor.
Turn on monitoring in the plugin, then look for the 3 P4L- CIM monitors.
Re: Error Messages and how to resolve
And yes, that OK in #3 is not a good OK. I suspect it is a script timeout, with the OK coming from the LT agent saying, in effect, "well, we just gave up".
The question is what was it doing just before that?
Re: Error Messages and how to resolve
To figure that out we need to look at the probe's command and script logs to see what it was doing and why it returned an "OK".
Post some of that here so we can have a peek.