Integrating Windows Compute Cluster Server into a Linux Environment through Platform LSF

White Paper

Published: October 2007, Updated: January 2008

Abstract

This white paper gives an overview of the integration of Windows Compute Cluster Server (WCCS) into a Linux environment using a third-party scheduler (Platform LSF) to manage the transfer of work. The white paper focuses on a typical scenario in which jobs submitted to a specified queue in the Platform LSF scheduler (which is deployed on a Linux compute cluster) are seamlessly integrated through Platform LSF to run in the Windows Compute Cluster Server environment.

Integrating Windows Compute Cluster Server into a Linux Environment with Platform LSF

Getting Started
Overview
System Architecture
System Requirements and Considerations
Preliminary Setup
Install and Configure a Dedicated Active Directory Server
Install Windows Compute Cluster Server
Configure WCCS Using the Compute Cluster Administrator
Set Up the Network
Configure Linux for File Sharing
Install Platform LSF on Windows
Preparatory Steps
Run the Installer File
Integrate WCCS with Platform LSF Running on Linux
Requirements
Create Users in Active Directory and in Windows
Install WCCS and Platform LSF Integration Package
Register the LSF Passwords
Configure the Linux LSF Environment
Operating WCCS from Platform LSF on Linux
Before You Begin
Submit Jobs to WCCS from the LSF Linux Cluster
bsub Command
Executable Residing on WCCS
Executable Residing on Linux
Monitor Jobs
Software Maintenance
Update Release Schedule
Windows Server Update Services
Summary
Appendix 1: Set Up Active Directory
Appendix 2: Remote Installation Services
Appendix 3: Windows Server Update Services
Appendix 4: Samba Configuration
Appendix 5: Related Links

Getting Started

This white paper describes how to integrate a Microsoft Windows Compute Cluster Server (WCCS) into an existing Linux cluster environment running the Platform LSF job scheduler. This integration enables Linux users to submit jobs to a Linux-based Platform LSF scheduler; the jobs are ultimately executed on WCCS.

Because high-performance computing (HPC) clusters represent a significant investment of resources, maximizing the benefits of your existing investment can result
in substantial savings. The interoperability of WCCS with third-party job schedulers like Platform LSF can ensure seamless integration into heterogeneous clusters. If you have an existing Linux cluster and are familiar with Platform LSF, you can continue to use it while adding the power and ease of WCCS to your environment. WCCS enables you to accomplish more, in less time and with reduced effort, by taking advantage of your users' existing skills and integrating with tools they are already using.

The main steps for the installation and configuration of WCCS and for the integration of WCCS and the Linux cluster are described in this white paper. Details of the procedures, however, are beyond the scope of this document. For step-by-step instructions, refer to the links in Appendix 5.

The integration procedures described in this paper have been verified with Platform LSF version 6.2 HPC, SUSE Linux
Enterprise Server 10, and WCCS. The WCCS cluster was deployed with a recommended architecture consisting of a head node with two network interfaces for management and a standalone Active Directory server system.

Overview

The integration of WCCS into the SUSE Linux environment enables users to submit jobs to the Platform LSF scheduler for execution on WCCS.

When a job is submitted to the Platform LSF scheduler's WCCS queue on Linux, it is transferred to the Platform LSF installation on Windows Compute Cluster Server. The Platform LSF installation running on WCCS then authenticates the user and runs the job using that user's credentials. The following figure shows the procedure.

Figure 1 Job Submission
(Figure labels: User submits job to WCCS queue on Linux. Linux head node with Platform LSF. Platform LSF on Linux transfers job to LSF on Windows. Windows head node with Platform LSF.)

If any files that are specified by the job reside in the Linux environment, Platform LSF will make a Samba connection to the Linux environment
with the credentials of the user submitting the job. The specified files are then transferred to a local directory, and the job is submitted to the WCCS Job Scheduler and executed. Once the job has been executed, specified files are transferred back to the Linux environment.

System Architecture

A typical WCCS configuration includes a system dedicated as an Active Directory server, a dedicated compute cluster head node, and one or more compute nodes. The Active Directory server and WCCS cluster head node are set up on the existing Linux cluster network. The head node uses an internal private network for node management and communication. Message Passing Interface (MPI) traffic can be routed over the private network or via a dedicated interconnect. Although not mandatory, the private cluster network enables use of the Remote Installation Services (RIS) automated deployment system.

Figure 2 Windows Compute Cluster Server Architecture

System Requirements and Considerations

The table below shows the system requirements.

Table 1 Windows Compute Cluster Server Requirements

All Systems: 64-bit CPU; 512 MB RAM minimum
Active Directory Server: Second hard drive recommended for increased Active Directory performance and recoverability
WCCS Head Node: Two network interfaces; MPI interconnect optional; separate disk partition for use with RIS
Compute Nodes: Pre-boot Execution Environment (PXE) boot enabled for RIS system management

Integration with Linux requires consideration of several additional factors. These are described in the table below.

Table 2 Considerations for Linux Integration

Network Range: The recommended Windows Compute Cluster Server implementation requires an IP address range of 192.168.0.x on its internal private cluster network. The external Linux cluster network will need to be set to a different range. It is possible for the Windows Compute Cluster to use a different address range if it is configured manually using the Routing and Remote Access Service (RRAS).
Samba: Samba is required for data to be imported and exported for job execution. It also enables Platform LSF to be managed in both environments from a single shared directory.
User Management: Samba, as of version 3, cannot act as an Active Directory domain controller. In order for Linux users to be able to authenticate, duplicate accounts will need to be created as Active Directory users with identical user names and passwords.

Preliminary Setup

In order to integrate WCCS with Platform LSF running on Linux as described in this white paper, you will need a working Linux cluster and a working WCCS cluster.

Following is a summary of the prerequisite steps for installation and configuration of WCCS. For detailed installation steps, see the WCCS documentation at: http:/

Install and Configure a Dedicated Active Directory Server

Begin by installing and configuring a dedicated Active Directory server. Use the following checklist for your Active Directory server installation and configuration. More detailed steps are given in Appendix 1.

Table 3 Active Directory Checklist

In most cases, the default options requested during installation are appropriate for WCCS and should be used.
Use of an NTFS-formatted file system is recommended.
When given the option to join a workgroup or domain, choose workgroup. The name of the workgroup is not important, as later during configuration you will create and join a new domain.
IP address may be acquired by Dynamic Host Configuration Protocol (DHCP), or may be set to static at this stage. (See instructions below.) Later, you will assign the server a static IP address.
When presented with the option to upgrade the system, choose Express for a shortcut to security updates. Once security updates are installed and the system reboots, select Update this server again, then choose Custom to locate any hardware driver updates.
Internet Security Level is set to High by default. You may need to restart the update
process for appropriate update applications to be installed and executed. Security level can be changed in Start > Control Panel > Internet Options > Security > Internet > Custom Level > Reset custom settings to the desired level.
WCCS Activation may require an Internet connection to Microsoft. Activation takes a few seconds and does not require registration. (Note: Depending on the license key used, a direct Internet connection may not be needed.)
If there is no CD drive permanently installed, it may be advantageous to copy the WCCS installation disk to the hard drive for easy access to Windows Server applications. Choose Start > My Computer, right-click the CD icon, and choose Explore. Create a new folder on the server, and drag the installation disk contents to it.
Set the static IP address, unless this was completed during installation.
Set the DNS server address to an available DNS server for your Linux cluster network. This address will be integrated into the DNS server. The DNS address in this control panel will then be changed automatically to the Active Directory localhost at 127.0.0.1.

Install Windows Compute Cluster Server

After installing Active Directory and DNS, use the following checklist to install WCCS.

Table 4 Windows Compute Cluster Server Checklist

Configure the head node. Set the static IP address on the interface connected to the Linux cluster network.
Set the Active Directory server as the primary DNS.
On the head node, join the Active Directory domain. Choose Start > My Computer, then right-click and select Properties, select Computer Name > Change, and then type the domain name.
On the head node, install the Microsoft Compute Cluster Pack (CCP). Install with the default options. In particular, choose the default option to create a new compute cluster.
Install WCCS on the compute nodes using RIS or an alternative remote deployment method. For more information about RIS, see Appendix 2.

Configure WCCS Using the Compute Cluster Administrator

Next, configure WCCS with the Compute Cluster Administrator. This is the main control interface for WCCS, and is shown in the following figure. The Compute Cluster Administrator can be found in: Start > All Programs > Microsoft Compute Cluster Pack > Compute Cluster Administrator. (Do not confuse this with a separate application in Administrative Tools that is labeled Cluster Administrator.)

Figure 3 Compute Cluster Administrator

The Compute Cluster Administrator provides access to the following:

Table 5 Compute Cluster Administrator

To Do List: Use the To Do List to complete configuration of the WCCS cluster.
Node Management: Select Node Management from the left pane of the Compute Cluster Administrator. The center pane will now display each node. A node must be approved before joining the cluster. Select each node and then Approve from the right pane of the window.
The node's status will change to Paused. Next, select Resume from the right pane to fully activate the node.
Remote Desktop Sessions: Select Launch Remote Desktop Connection from the right pane and a desktop logon for that node will be shown in the center pane. User logon is authenticated according to the users and policies of the Active Directory domain.
WCCS Activation: Each installation of Windows Compute Cluster Server requires activation within 14 days of installation. On the desktop of each node, select the keys icon in the bottom right corner of the screen. An activation dialog box will open. After 14 days without activation, the node will not be accessible. (Note: There are variations depending on the key and the operating system version used.)
Compute Cluster Job Scheduler: Select Launch Compute Cluster Job Scheduler from the right pane. The Job Queue window will open to show the jobs submitted, running, and completed by the cluster. Jobs can also be submitted and managed from here.

Set Up the Network

Use the following checklist for your network setup.

Table 6 Network Setup

Select the setup type. Note that the compute nodes are isolated on a private network.
Configure the public network. Choose Local Area Connection with an external-facing address as previously set.
Configure the private network. Choose Local Area Connection connected to the WCCS compute nodes.
Enable Network Address Translation (NAT) using Internet Connection Sharing (ICS). Choose Enable Internet Connection Sharing.

Configure Linux for File Sharing

While the procedure that follows was tested on RedHat 4 systems, it is believed that this procedure will work on any Linux system that supports Samba 3.x.

To establish a shared file system accessible by the WCCS, you must have Kerberos 5, Winbind, and Samba installed and properly configured. In addition, you must configure PAM and nsswitch.conf to authenticate against the Active Directory server.

Install and Configure Kerberos

Depending on the installation options
selected, Kerberos 5 may have been installed when the operating system was installed. To find out if Kerberos is installed, run (on RedHat Linux):

    # rpm -qa | grep krb
    krb5-workstation-1.3.4-17
    krb5-auth-dialog-0.2-1
    krb5-libs-1.3.4-17
    krbafs-1.2.2-6
    krb5-devel-1.3.4-17
    pam_krb5-2.1.8-1
    krbafs-devel-1.2.2-6

At a minimum, you will need krb5-workstation, krb5-libs, and pam_krb5.

If Kerberos is not installed, the source can be obtained from http://web.mit.edu/Kerberos. Alternatively, many Linux vendors provide Kerberos packages that you can download and install. The RedHat distribution (and presumably other vendor-supplied packages) of Kerberos is a standard compile of the MIT Kerberos distribution, packaged in rpm format.

Once Kerberos is installed, it needs to be configured. Kerberos is configured by editing the /etc/krb5.conf file. In the example below, the Active Directory server AD.WCCS.ZOHALLT.COM is the domain controller for the domain WCCS.ZOHALLT.COM.

    [logging]
    default = FILE:/var/log/