GlassFish 3.1.2 - validate-dcom fails with "The remote file, C: doesn't exist" (Centralized Administration with Windows DCOM) - glassfish-3

OS: Windows Server 2008 R2 on both machines (firewall disabled on both)
I wish to take advantage of the GlassFish 3.1.2 Windows DCOM feature to set up communication between the GlassFish DAS and a remote node. I've successfully followed Byron Nevins' instructions on using the GlassFish 3.1.2 DCOM Configuration Utility.
However, I'm having an issue validating DCOM when following the instructions in the GlassFish 3.1.2 guide, chapter 2, Enabling Centralized Administration of GlassFish Server Instances.
When I run the command validate-dcom --passwordfile C:/Sun/AppServer/password.txt -v 192.168.0.80, I get the following output:
asadmin> validate-dcom --passwordfile C:/Sun/AppServer/password.txt -v 192.168.0.80
remote failure:
Successfully verified that the host, 192.168.0.80, is not the local machine as required.
Successfully resolved host name to: /192.168.0.80
Successfully connected to DCOM Port at port 135 on host 192.168.0.80.
Successfully connected to NetBIOS Session Service at port 139 on host 192.168.0.80.
Successfully connected to Windows Shares at port 445 on host 192.168.0.80.
The remote file, C: doesn't exist on 192.168.0.80 : Logon failure: unknown user name or bad password.
Password file, password.txt, contains a single entry:
AS_ADMIN_WINDOWSPASSWORD=my-windows-password
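A quick way to rule out a malformed password file is to read it back and confirm the expected entry is present. Below is a minimal Python sketch; the helper names are hypothetical, and the parsing rules (one KEY=value pair per line, lines starting with # ignored) are an assumption about the file layout, not GlassFish's actual parser.

```python
from pathlib import Path


def read_password_file(path):
    """Return the KEY=value entries of an asadmin-style password file as a dict.

    Assumed layout: one KEY=value pair per line; blank lines and lines
    starting with '#' are skipped.
    """
    entries = {}
    # utf-8-sig tolerates a byte-order mark written by some Windows editors.
    for raw in Path(path).read_text(encoding="utf-8-sig").splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            entries[key.strip()] = value
    return entries


def has_windows_password(path):
    """True if AS_ADMIN_WINDOWSPASSWORD is present and non-empty."""
    return bool(read_password_file(path).get("AS_ADMIN_WINDOWSPASSWORD"))
```

Calling has_windows_password("C:/Sun/AppServer/password.txt") on the DAS machine would confirm that the entry is readable. It does not verify that the password itself is correct.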
I have double-checked that I can successfully log in with my Windows password on the remote machine 192.168.0.80. I've also tried this test with two Windows XP Professional machines and get the same error.
I also performed this operation by creating a New Node in the Admin Console and got the same error.
I cannot figure out what is going wrong or what I may be missing.
Thanks in advance

I had similar issues while setting up the new production environment at work last Friday, and could not find any useful information on the interwebs, except people encountering the same issue, some with comments as fresh as the day I was looking it up.
So after a rather excessive amount of painful, in-depth debugging, I was able to figure out a few things:
You must explicitly specify the local Windows user you create for the purpose of running GlassFish in both the add-node dialog and the validate-dcom subcommand (option -w); otherwise it will default to either 'admin' or the user the DAS is running as.
There is a bug in validate-dcom that causes it to ignore whatever you specify as the test directory. No matter what you do, it will always use C:\ and result in "access-denied".
The documentation omits another registry key to which access must be granted in order for WMI to work.
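For the first point, here is a minimal Python sketch that assembles the command line from the question with the -w option added. The function name and the user name glassfish-svc are hypothetical, and the sketch assumes asadmin is on the PATH and that --passwordfile is given as an asadmin utility option before the subcommand. It only builds and prints the command; it does not run it.

```python
def build_validate_dcom_args(host, password_file, windows_user):
    """Build an asadmin argument list for validate-dcom with an explicit Windows user."""
    return [
        "asadmin",
        "--passwordfile", password_file,  # file containing AS_ADMIN_WINDOWSPASSWORD
        "validate-dcom",
        "-w", windows_user,               # local Windows account created for GlassFish
        "-v",                             # verbose output, as in the question
        host,
    ]


if __name__ == "__main__":
    # "glassfish-svc" is a placeholder account name, not one from the original post.
    args = build_validate_dcom_args(
        "192.168.0.80", "C:/Sun/AppServer/password.txt", "glassfish-svc"
    )
    print(" ".join(args))
```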
Regarding the first issue, you will most likely encounter it if your nodes are not part of a domain or you are using a local account. Windows NT6+ has a new default security policy that prevents local users from elevating privileges over the network. This necessarily causes that test to fail, since writing to the root of a system drive is not something one can do without elevation.
I previously blogged about it for someone to stumble upon it if needed:
http://www.raptorized.com/2008/08/19/access-administrative-shares-on-server-2008vista/
The gist of it is that you have to navigate to the following registry key:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System
and create a new DWORD named LocalAccountTokenFilterPolicy with a value of 1.
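If you prefer to review the change before applying it on each node, the same setting can be written out as a .reg file. Below is a minimal Python sketch that only generates the file's text; the function name is hypothetical, and the result should be inspected before it is imported with regedit.

```python
def reg_file_for_dword(key_path, value_name, value):
    """Return the text of a .reg file that sets one DWORD value under key_path."""
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key_path}]\n"
        f'"{value_name}"=dword:{value:08x}\n'
    )


if __name__ == "__main__":
    print(
        reg_file_for_dword(
            r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System",
            "LocalAccountTokenFilterPolicy",
            1,
        )
    )
```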
There is no need to reboot; the first, previously failing test should now pass. However, you will then see an error about being unable to connect to WMI, and validation will fail again.
To remedy this, you must also take ownership and grant your local service account user full control over the following registry key, in addition to the other ones described in the HA Administration Guide:
HKEY_CLASSES_ROOT\CLSID\{76A64158-CB41-11D1-8B02-00600806D9B6}
Afterwards, validate-dcom should report success and you will be able to add it as a node, and create instances on it.
I hope this helps, because the seeming lack of activity from Oracle on that issue was infuriating.
I am also less than pleased by the hackish, ugly, insecure nature of the DCOM support in GlassFish 3 :(

Related

gRPC in an ASP.NET Core host: context deadline exceeded

I am trying to connect to a gRPC service in an ASP.NET Core application that runs on a Windows 10 computer.
I want to connect with grpcui. If I run grpcui on the same computer, without TLS, I can do it this way:
grpcui -plaintext localhost:5110
Then I would like to connect from another computer (a Windows 10 VirtualBox VM), so I use this command:
grpcui -plaintext 192.168.1.2:5110
But I get an error that says "context deadline exceeded".
If I disable the firewall on the service computer, I get a different error: "No connection could be made because the target machine actively refused it." So the problem seems to be the firewall on the server computer.
NOTE: I will not pay attention to this second error for now; I would like to solve the first one first. If needed, I will open another question for it later, to avoid mixing two different problems in one question.
I then added two outbound rules, one for the .exe file of the ASP.NET application and another for the conhost.exe file in windows\system32, because Task Manager shows these two files running when I start the ASP.NET application. I did the same for the inbound rules.
But the problem is the same.
So which rules do I have to set in the firewall to allow connections to the service?
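The two errors in the question correspond to two different TCP outcomes, and it can help to tell them apart before touching firewall rules. Below is a minimal Python sketch (the function name probe is hypothetical). A timeout usually suggests that packets are being dropped, for example by a firewall. A refusal means the host was reached but nothing accepted the connection on that address and port, which can happen, for instance, when a server listens only on localhost. This is a diagnostic aid under those assumptions, not a statement about what is wrong in this particular setup.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP connection attempt as 'open', 'refused', or 'timeout'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but no listener accepted on this address/port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout; packets may be dropped.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    # Address and port taken from the question.
    print(probe("192.168.1.2", 5110))
```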
Thanks.

Specific IIS user not working with TLS 1.2

We have run into a problem with IIS, TLS 1.2 and domain users. I searched SO and other forums, but none of the possibly related topics led me to a solution.
Please don't judge the configuration, it wasn't invented by me, I just need to solve this problem.
What happens is the following:
We have an old web application that starts an executable with Process.Start, and that executable calls an external web service. This used to work fine with TLS 1.0, but the external web service will require TLS 1.2 in the near future.
So now we are trying to make this work, and we are almost there: we upgraded the executable's .NET Framework version to 4.7.2 and enabled TLS 1.2 on the Windows Server 2008 R2 machine. The web app's .NET Framework version is set to 4.6.1. It seems to me that this should be everything there is to it.
And indeed, when we run the executable standalone (not called by the web app) on the server, so that it is owned by the domain user logged on to the server (with RDP), everything works as expected; we receive the proper answer from the web service.
Also, when the executable is called by the web app and the application pool identity in IIS is set to the built-in account ApplicationPoolIdentity, everything works as expected as well.
But when we set the application pool identity to a dedicated domain account (a different one from the account that ran the executable earlier), the trouble begins. Connecting to the web service fails with the following exception:
System.ServiceModel.EndpointNotFoundException: There was no endpoint
listening at https://<some url>/<some webservice name>.asmx that could
accept the message. This is often caused by an incorrect address or
SOAP action. See InnerException, if present, for more details. --->
System.Net.WebException: Unable to connect to the remote server
---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a
period of time, or established connection failed because connected
host has failed to respond ...
Now the question is of course, what could be causing this?
I like to believe that the failing domain account is configured correctly, but it seems it is not. Or could it be something else, that I don't even know the existence of...
EDIT:
I managed to narrow it down to a permissions issue: when the dedicated domain account runs the application standalone, it works as it should. When the dedicated account runs it from within the IIS context (started by the web app), it doesn't work, but when the dedicated account is given admin rights, it also works as expected.
That leaves me with the question: what additional permissions does IIS need to allow this setup? Maybe in combination with TLS 1.2 thingies.
Any help would be greatly appreciated.

AWS CodeDeploy vs Windows 2016 in ASG

I use AWS CodeDeploy to deploy builds from GitHub to EC2 instances in an Auto Scaling group.
It works fine on Windows 2012 R2 with all deployment configurations.
But on Windows 2016 it fails completely with the "OneAtATime" deployment configuration.
With "AllAtOnce", only one or two instances deploy successfully; all the others fail.
In the logfile on agent this suspicious message is present:
ERROR [codedeploy-agent(1104)]: CodeDeploy Instance Agent Service: CodeDeploy Instance Agent Service: error during start or run: Errno::ETIMEDOUT
- A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. - connect(2)
All policies, roles, software, builds and other stuff are the same, I even tested this on brand new AWS account.
Has anybody faced such behaviour?
I ran into the same problem. During my investigation, I found that the server's route table had wrong routes for the 169.254.169.254 network (it specified the gateway of the network where my template was captured), so the instance couldn't read its metadata.
From the above error, it looks like the agent isn't able to talk to the CodeDeploy endpoint after the instance starts up. Please check that the routing tables and any proxy-related settings are set up correctly. Also, if you have not done so already, you can turn on the debug log by setting :verbose to true in the agent config and restarting the agent. This would help debug the issue further.
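To check the metadata route mentioned above from the instance itself, a small reachability test is often enough. Below is a minimal Python sketch; the function name is hypothetical, and the path /latest/meta-data/ is assumed to be the usual EC2 metadata path. Any HTTP response, including an error status such as 401 when a session token is required, is treated as proof that routing works. Proxies are deliberately bypassed because the metadata address is link-local.

```python
import urllib.error
import urllib.request

# Assumed EC2 instance metadata URL; adjust if your environment differs.
METADATA_URL = "http://169.254.169.254/latest/meta-data/"


def metadata_reachable(url=METADATA_URL, timeout=2.0):
    """Return True if the metadata endpoint answers with any HTTP response."""
    # An empty ProxyHandler disables proxies picked up from the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so the route itself works.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, unreachable network, or timeout.
        return False


if __name__ == "__main__":
    print("metadata reachable:", metadata_reachable())
```

If this prints False on an affected instance, the route table entry for 169.254.169.254 is a reasonable first thing to inspect.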

HTTP Error 503. The service is unavailable

I'm struggling to set up the environment in IIS 8. I searched a lot but couldn't find the right solution.
I checked the error logs, but they gave me no idea what is wrong.
C:\Windows\System32\LogFiles\HTTPERR
2013-10-09 09:28:39 192.168.43.205 60172 192.168.43.205 80 HTTP/1.1
GET / 503 2 AppOffline qa.hti.local
2013-10-09 09:28:39 192.168.43.205 60192 192.168.43.205 80 HTTP/1.1
GET /favicon.ico 503 2 AppOffline qa.hti.local
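To make those entries easier to read, here is a minimal Python sketch that labels each field. It assumes the default HTTP.sys error-log field order (date, time, client IP and port, server IP and port, protocol version, verb, URL, status, site ID, reason phrase, queue name) and that each entry is a single line (the entries above appear wrapped onto two lines). The function name is hypothetical. Under that assumption, the 2 after 503 would be the site ID and AppOffline the reason phrase.

```python
# Assumed default field order of an HTTPERR log entry.
FIELDS = (
    "date", "time", "c_ip", "c_port", "s_ip", "s_port", "cs_version",
    "cs_method", "cs_uri", "sc_status", "s_siteid", "s_reason", "s_queuename",
)


def parse_httperr_line(line):
    """Split one HTTPERR entry into a dict keyed by the assumed field names."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


if __name__ == "__main__":
    entry = parse_httperr_line(
        "2013-10-09 09:28:39 192.168.43.205 60172 192.168.43.205 80 "
        "HTTP/1.1 GET / 503 2 AppOffline qa.hti.local"
    )
    for name, value in entry.items():
        print(f"{name}: {value}")
```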
Then in Event Viewer:
WARNINGS:
A listener channel for protocol 'http' in worker process '11188'
serving application pool 'qa.hti.local' reported a listener channel
failure. The data field contains the error number.
A listener channel for protocol 'http' in worker process '7492'
serving application pool 'qa.hti.local' reported a listener channel
failure. The data field contains the error number.
A listener channel for protocol 'http' in worker process '9088'
serving application pool 'qa.hti.local' reported a listener channel
failure. The data field contains the error number.
A listener channel for protocol 'http' in worker process '9964'
serving application pool 'qa.hti.local' reported a listener channel
failure. The data field contains the error number.
A listener channel for protocol 'http' in worker process '7716'
serving application pool 'qa.hti.local' reported a listener channel
failure. The data field contains the error number.
I don't understand what the warning means.
ERROR: Application pool 'qa.hti.local' is being automatically disabled due to a series of failures in the process(es) serving that
application pool.
Note: I learned that five consecutive failures lead to the app pool being disabled, and that this limit can be increased. I tried increasing it, but with no success.
Please share your thoughts.
I came across this question as I was experiencing a similar issue and searching for a solution.
My problem specifically had to do with our IIS shared configuration. We had enabled a feature in IIS on one of the servers (Http Redirect) that was not installed on any of the others so the server 'features' were out of sync with all the servers.
I was able to resolve the issue by uninstalling the new addition on the first server so it was back to matching the others. An IIS reset later and the AppPools were no longer going down and all was back to normal.
So if you are using IIS Shared Configuration, IIS is returning 'Service Unavailable' errors and the AppPools are going down, this can be a symptom of the system configuration being out of sync, which corrupts the shared configuration. Hopefully this post will help someone find the solution faster than I was able to.
I had a similar problem, and it was due to another error (2282 IIS-W3SVC-WP) that the pool stopped itself. My problem was that the module owssvr.dll could not be loaded due to a configuration problem.
The solution for me was to change the setting "Enable 32-bit applications" from True to False.
I was deploying a solution on Sharepoint 2013 on a Windows Server 2012 with Visual studio 2013.
I know this was supposedly a very specific problem, but I want to help all those with the same problem.
Probably you do not have permission to read the directory.
The directory (and the files) need to have read access for either the Windows group "USERS" or the Windows group "IIS_IUSRS", and also for your application pool identity.
This occurred for me too on Windows 10 1803 after an update.
Earlier in the event log there were errors to do with missing DLLS, specifically iiswsock.dll and compstat.dll.
After turning on the following Windows features (under IIS > WWWServices > Performance Features and AppDev Features):
Dynamic and Static Content Compression, and
WebSocket Protocol Windows features
those errors disappeared after an IIS restart.
503 2
Is that 2 the substatus code? If so, you might want to make sure your site is not being hit with an excessive number of requests (5000+).
http://support.microsoft.com/kb/943891
The data field contains the error number.
what's the error number?
This can also be caused by your software vendor not realizing that they installed the 32-bit version of the application pool apps on your 2008 R2 server. They wanted me to reinstall IIS, but after a little troubleshooting on my server I checked the Windows application logs and emailed them the x86 architecture error for their app.
This can occur if a piece of IIS is missing (e.g. one of the many IIS modules has not been installed on the new server).
The thing to do is to carefully compare the IIS config of the source and target, and add the missing pieces to the target.
In a recent IIS8.5->8.5 migration, I had this issue. Went through the whole stack. The last piece was that the Web Cert auth piece of IIS was missing.
To install:
powershell
import-module servermanager
add-windowsfeature Web-Cert-Auth
I reinstalled the server and copied applicationHost.config to the new server, but did not install the corresponding modules, and got this error. I fixed it by checking the modules in IIS Manager and making sure they were installed.
Update: In Windows Logs -> Application, there is some information about which module is missing.
In case this helps anyone else: We run multiple https sites on an IIS 10 server. One of the steps in configuring a new site is to give the new application pool permission to the system certificate. In the registry, under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\SystemCertificates\MY, grant the application pool on the local machine full control. That fixed this issue for us.
I had the same problem and solved it by adding the domain account that the application pool was using as its identity to the local Administrators group on the web server. Perhaps it would also do the trick to grant the application pool identity account access to the application directory, as @Lisa-Berlin stated above.

Unable to request a user password reset

I am on Plone 4.1.6. If you go to Site setup > Users and Groups, tick the "Reset Password" checkbox for a user and click "Apply changes", then the system hangs, and after 5 minutes I get this error from Apache:
Proxy Error
The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request POST /@@usergroup-userprefs.
Reason: Error reading from remote server
Apache/2.2.22 (Ubuntu) Server at 192.168.1.4 Port 443
After the error, I have to restart Plone to make Plone respond again.
My Environment:
Plone 4.1.6 (4115)
CMF 2.2.6
Zope 2.13.15
Python 2.6.8 (unknown, Apr 27 2013, 22:01:31) [GCC 4.6.3]
Addons:
Diazo theme support 1.0b8
Installs a control panel to allow on-the-fly theming with Diazo
Plone Classic Theme 1.1.2
The old theme used in Plone 3 and earlier versions.
Static resource storage 1.0b5
A folder for storing and serving static resource files
I am running Plone behind Apache
Testing locally
Running a virtual machine with VirtualBox 4.2.12
Plone is installed on the virtual machine
Plone version is 4.1.6
Virtual machine is running Ubuntu 12.04 AMD64
Zeocluster with 2 clients
Email is properly configured in the Plone instance
As far as I know, everything is working fine with my Plone instance, including the other checkboxes available in Users and Groups.
I did a test with ssmtp to send an email to myself from my node on the vm and I have no problem sending the email.
I did try fg mode and everything seems OK.
I did check the Apache logs and everything seems OK too.
If I create an SSH tunnel to bypass Apache and access Plone directly, I don't get a proxy error, but the system hangs forever.
I don't know what to do to solve this problem. Any ideas?
Does the python process use a lot of CPU when it hangs? Check using top.
Install ZopeHealthWatcher, then when it hangs again, use zope health watcher to get a list of what each thread is doing. That will often give you an idea of where the code is sitting either in a loop, in infinite recursion of some sort (this can happen in the zodb, especially with acquisition and similarly named things), or if it is merely blocking on something (eg, mtu issues on the network link to the smtp server for example, so small emails work but big ones hang).
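To give an idea of what such a per-thread listing looks like, here is a minimal Python sketch that prints the current stack of every thread in a process. It is not ZopeHealthWatcher's code, only an illustration of the same idea using the standard library; the function name dump_threads is hypothetical, and the sketch is written for current Python 3 rather than the Python 2.6 used in the question.

```python
import sys
import threading
import traceback


def dump_threads():
    """Return a text report with the current call stack of every thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    # sys._current_frames() maps thread ids to each thread's topmost frame.
    for ident, frame in sys._current_frames().items():
        header = f"Thread {names.get(ident, '?')} ({ident}):\n"
        chunks.append(header + "".join(traceback.format_stack(frame)))
    return "\n".join(chunks)


if __name__ == "__main__":
    print(dump_threads())
```

A thread that is stuck shows the same innermost function in every dump, which is usually enough to tell a blocking call from a busy loop.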
You could also stop the smtp server (or just change the port in the plone control panel) and see if you at least get an exception out of that. Plone should, by default, raise an exception if it cannot connect to the smtp server.
In really extreme cases, you can use gdb to connect to the hanging python process (I usually use "top" to find the one sitting at 100% CPU) and find out where it is hanging. This is a lot more complex than using ZopeHealthWatcher, but I recently traced a hang to a race condition in ReportLab's font code using precisely this method; it is very powerful. What is nice about gdb is that it stops the process and allows you to step through the code, and up and down the call stack, unlike ZopeHealthWatcher, which just gives you a snapshot (a bit like that Heisenberg uncertainty thing, you can observe where it is now...)

Resources