Azure Service Fabric MultiMachine Windows X509 Cluster - Timed out waiting for Installer Service to complete for machine vm1

I need some advice; any help would be really appreciated.

I am trying to create a standalone Service Fabric cluster as described at https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-cluster-creation-for-windows-server

To be specific, a Windows.X509.MultiMachine cluster.

I have an Active Directory domain and its certificate, one machine acting as the domain controller, and three nodes that I want to form into a cluster.

The TestConfiguration.ps1 PowerShell script indicates that everything is OK, but CreateServiceFabricCluster.ps1 takes a long time and then throws the error "Timed out waiting for Installer Service to complete for machine DEV1." (and the same for the other machines). The DiagnosticsStore is empty.

DeploymentTrace:

2018/02/14-13:58:51.837,Info,5084,SystemFabricDeployer.SFDeployer,Running Best Practices Analyzer...
2018/02/14-13:58:51.844,Verbose,5084,SystemFabricDeployer.SFDeployer,Validating executing user is an Administrator.
2018/02/14-13:58:51.850,Verbose,5084,SystemFabricDeployer.SFDeployer,Converting JSON config to model.
2018/02/14-13:58:52.200,Error,5084,SystemFabricDeployer.SFDeployer,Config validation: Server Certificate Thumbprint contains invalid characters
2018/02/14-13:58:52.201,Verbose,5084,SystemFabricDeployer.SFDeployer,Validating CAB file is valid at C:\Users\Administrator\Documents\mssf\DeploymentRuntimePackages\MicrosoftAzureServiceFabric.6.1.456.9494.cab.
2018/02/14-13:58:52.591,Error,5084,SystemFabricDeployer.SFDeployer,Best Practices Analyzer determined environment has an issue. Please see additional BPA log output in DeploymentTraces folder.
2018/02/14-13:58:52.592,Error,5084,SystemFabricDeployer.SFDeployer,Cluster Setup cancelled due to validation error(s) found by Best Practices Analyzer. Inspect details in DeploymentTraces log folder local to executing location.
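
The DeploymentTrace above reports "Server Certificate Thumbprint contains invalid characters". The ClusterConfig.json posted further down uses ServerCertificateCommonNames rather than a thumbprint, so this trace may come from a run with a different config. If a thumbprint is used, one possible cause (an assumption, not confirmed by these logs) is an invisible Unicode character picked up when the value is copied from the Windows certificate dialog. A minimal Python sketch for spotting such characters is shown here; the helper names are hypothetical, and treating spaces as allowed is based only on the spaced format shown in the config's metadata string.

```python
import string

# Characters accepted by this check: hex digits, plus spaces
# (assumption: spaces are allowed, as in the spaced sample format).
HEX_DIGITS = set(string.hexdigits)


def find_invalid_thumbprint_chars(thumbprint):
    """Return (index, code point) pairs for characters that are not hex digits or spaces."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(thumbprint)
        if ch not in HEX_DIGITS and ch != " "
    ]


def normalize_thumbprint(thumbprint):
    """Keep only the hex digits and upper-case them."""
    return "".join(ch for ch in thumbprint if ch in HEX_DIGITS).upper()


# Sample value from the config's metadata string, with a hypothetical
# left-to-right mark (U+200E) prepended to simulate a bad copy/paste.
pasted = "\u200ed5 ec 42 3b 79 cb e5 07 fd 83 59 3c 56 b9 d5 31 24 25 42 64"
print(find_invalid_thumbprint_chars(pasted))
print(normalize_thumbprint(pasted))
```

If the first call reports anything, retyping the thumbprint by hand or pasting the normalized value into the config would be a reasonable next step before rerunning TestConfiguration.ps1.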

EventLog:

2/14/2018 6:52:33 AM - DEV1 - Error - Timed out waiting for Installer Service to complete for machine DEV1. Investigation order: FabricInstallerService -> FabricSetup -> FabricDeployer -> Fabric
2/14/2018 6:52:33 AM - DEV1 - Error - Timed out waiting for Installer Service to complete for machine DEV2. Investigation order: FabricInstallerService -> FabricSetup -> FabricDeployer -> Fabric
2/14/2018 6:52:33 AM - DEV1 - Error - Timed out waiting for Installer Service to complete for machine DEV3. Investigation order: FabricInstallerService -> FabricSetup -> FabricDeployer -> Fabric
2/14/2018 6:52:49 AM - DEV1 - Error - federation open failed with FABRIC_E_TIMEOUT
2/14/2018 6:52:49 AM - DEV1 - Error - Fabric Node open failed with error code = FABRIC_E_TIMEOUT
2/14/2018 6:52:52 AM - DEV1 - Error - Target information file exists. This would indicate that Fabric node open or Fabric uninstall didn't happen successfully. Rolling back..
2/14/2018 6:57:50 AM - DEV1 - Error - federation open failed with FABRIC_E_TIMEOUT
2/14/2018 6:57:50 AM - DEV1 - Error - Fabric Node open failed with error code = FABRIC_E_TIMEOUT
2/14/2018 7:02:51 AM - DEV1 - Error - federation open failed with FABRIC_E_TIMEOUT
2/14/2018 7:02:51 AM - DEV1 - Error - Fabric Node open failed with error code = FABRIC_E_TIMEOUT
2/14/2018 7:04:24 AM - DEV1 - Error - CreateCluster Error: System.AggregateException: One or more errors occurred. ---> System.ServiceProcess.TimeoutException: Timed out waiting for Installer Service to complete for machine DEV3. Investigation order: FabricInstallerService -> FabricSetup -> FabricDeployer -> Fabric
   at Microsoft.ServiceFabric.DeploymentManager.DeploymentManagerInternal.StartAndValidateInstallerServiceCompletion(String machineName, ServiceController installerSvc)
   at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
   at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
   at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object )
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Threading.Tasks.Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Func`4 bodyWithLocal, Func`1 localInit, Action`1 localFinally)
   at System.Threading.Tasks.Parallel.ForEachWorker[TSource,TLocal](IEnumerable`1 source, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Action`3 bodyWithStateAndIndex, Func`4 bodyWithStateAndLocal, Func`5 bodyWithEverything, Func`1 localInit, Action`1 localFinally)
   at System.Threading.Tasks.Parallel.ForEach[TSource](IEnumerable`1 source, Action`1 body)
   at Microsoft.ServiceFabric.DeploymentManager.DeploymentManagerInternal.RunFabricServices(List`1 machines, FabricPackageType fabricPackageType)
   at Microsoft.ServiceFabric.DeploymentManager.DeploymentManagerInternal.<CreateClusterAsyncInternal>d__1.MoveNext()
---> (Inner Exception #0) System.ServiceProcess.TimeoutException: Timed out waiting for Installer Service to complete for machine DEV3. Investigation order: FabricInstallerService -> FabricSetup -> FabricDeployer -> Fabric
   at Microsoft.ServiceFabric.DeploymentManager.DeploymentManagerInternal.StartAndValidateInstallerServiceCompletion(String machineName, ServiceController installerSvc)
   at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
   at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
   at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object )<---

---> (Inner Exception #1) System.ServiceProcess.TimeoutException: Timed out waiting for Installer Service to complete for machine DEV1. Investigation order: FabricInstallerService -> FabricSetup -> FabricDeployer -> Fabric
   at Microsoft.ServiceFabric.DeploymentManager.DeploymentManagerInternal.StartAndValidateInstallerServiceCompletion(String machineName, ServiceController installerSvc)
   at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
   at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
   at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object )<---

---> (Inner Exception #2) System.ServiceProcess.TimeoutException: Timed out waiting for Installer Service to complete for machine DEV2. Investigation order: FabricInstallerService -> FabricSetup -> FabricDeployer -> Fabric
   at Microsoft.ServiceFabric.DeploymentManager.DeploymentManagerInternal.StartAndValidateInstallerServiceCompletion(String machineName, ServiceController installerSvc)
   at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
   at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
   at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object )<---

My nodes' resolvable names are DEV1, DEV2, and DEV3; the domain is "hp.dev" and its NetBIOS name is HPDEV. All of them run Windows Server 2016 Standard, if that helps.
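
Since the nodes are referred to by name (DEV1, DEV2, DEV3), it can be worth confirming that each name resolves from the machine that runs the deployment script. The following is a minimal Python sketch using only the standard library; the function name is hypothetical, and a successful lookup alone does not show that the ports Service Fabric needs are reachable.

```python
import socket


def resolve_all(names):
    """Map each host name to its sorted list of resolved addresses, or to an error string."""
    results = {}
    for name in names:
        try:
            infos = socket.getaddrinfo(name, None)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address for both IPv4 and IPv6.
            results[name] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            results[name] = f"unresolved: {exc}"
    return results


# Example call, to be run on each node in turn:
#   for name, outcome in resolve_all(["DEV1", "DEV2", "DEV3"]).items():
#       print(name, outcome)
```

Running it on every node, not only on DEV1, would show whether each machine can look up the other two.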

Here is my ClusterConfig.json:

{
"name": "hP Secure Cluster",
"clusterConfigurationVersion": "1.0.0",
"apiVersion": "10-2017",
"nodes": [
    {
        "nodeName": "Node1",
        "iPAddress": "DEV1",
        "nodeTypeRef": "NodeTypeDefault",
        "faultDomain": "fd:/dc1/r1",
        "upgradeDomain": "UD1"
    },
    {
        "nodeName": "Node2",
        "iPAddress": "DEV2",
        "nodeTypeRef": "NodeTypeDefault",
        "faultDomain": "fd:/dc1/r2",
        "upgradeDomain": "UD2"
    },
    {
        "nodeName": "Node3",
        "iPAddress": "DEV3",
        "nodeTypeRef": "NodeTypeDefault",
        "faultDomain": "fd:/dc1/r3",
        "upgradeDomain": "UD3"
    }
],
"properties": {
    "diagnosticsStore": {
        "metadata": "Please replace the diagnostics file share with an actual file share accessible from all cluster machines.",
        "dataDeletionAgeInDays": "7",
        "storeType": "FileShare",
        "connectionstring": "C:\\ProgramData\\SF\\DiagnosticsStore"
    },
    "security": {
        "metadata": "The Credential type X509 indicates this is cluster is secured using X509 Certificates. The thumbprint format is - d5 ec 42 3b 79 cb e5 07 fd 83 59 3c 56 b9 d5 31 24 25 42 64.",
        "ClusterCredentialType": "Windows",
        "ServerCredentialType": "X509",
        "WindowsIdentities": {
            "ClusterIdentity": "HPDEV\\Administrator"
        },
        "CertificateInformation": {
            "ServerCertificateCommonNames": {
                "CommonNames": [
                    {
                        "CertificateCommonName": "HPCA",
                    }
                ],
                "X509StoreName": "My"
            }
        }
    },
    "no
Mar 4, 2022 in Azure by Edureka
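
The ClusterConfig.json excerpt above is cut off, so it cannot be checked as a whole. One detail in the visible part stands out: the comma after "CertificateCommonName": "HPCA" is a trailing comma, which strict JSON parsers reject. Whether the Service Fabric tooling tolerates it is not known from this post, and TestConfiguration.ps1 reportedly passed, so this is a thing to rule out, not a confirmed cause. A minimal Python sketch of such a check, with a hypothetical helper name and a fragment copied from the config:

```python
import json


def check_json(text):
    """Return None if text parses as strict JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


# Fragment copied from the posted config, including the trailing comma.
fragment = """{
    "CommonNames": [
        {
            "CertificateCommonName": "HPCA",
        }
    ]
}"""

print(check_json(fragment))
# The same fragment without the trailing comma parses cleanly.
print(check_json(fragment.replace('"HPCA",', '"HPCA"')))
```

Passing the full file contents to the same helper would flag any other strict-JSON problems in the complete config.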

