SystemAdministration/TheRiseAndFallOfNewNodeManagement

From SoylentNews

For those who opened this sacred tomb, take a moment to decide whether you truly wish to proceed. There are better things to do with your life, like walking around the world or learning to play pinball with your feet. If you truly wish to proceed, remember that there be dragons here.

Initial Setup

(This guide assumes we're using Ubuntu 12.04 on Linode. Most of it is still relevant in general; just ignore the Linode bits.)

Once a new node is created on Linode, you need to deploy Ubuntu 12.04; this can be done easily from the "Dashboard" tab. Make sure you give it 512M of swap, power it up, write down the root password, then open a console. We've got work to do.

On helium, in the root home directory, there's a folder called deployment_kit which has all the files you need to copy in place.

Install All Updates

Linode's image is a bit out of date, so a quick upgrade is needed. First, update the package index:

Last login: Sat Mar 22 22:02:47 2014
root@localhost:~# apt-get update
Get:1 http://mirrors.linode.com precise Release.gpg [198 B]
Get:2 http://mirrors.linode.com precise-updates Release.gpg [198 B]
Get:3 http://mirrors.linode.com precise-backports Release.gpg [198 B]
Get:4 http://mirrors.linode.com precise-security Release.gpg [198 B]
-SNIP-

Then install updates

root@localhost:~# apt-get dist-upgrade
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Calculating upgrade... Done
The following packages will be upgraded:
  accountsservice apport apt apt-transport-https apt-utils apt-xapian-index base-files bash-completion bc bind9-host curl dbus dmsetup dnsutils dosfstools
  dpkg file gnupg gpgv grub-common ifupdown initramfs-tools initramfs-tools-bin iproute isc-dhcp-client isc-dhcp-common landscape-common language-pack-en
  language-pack-en-base language-selector-common libaccountsservice0 libapt-inst1.4 libapt-pkg4.12 libasn1-8-heimdal libbind9-80 libc-bin libc6 libcurl3
  libcurl3-gnutls libdbus-1-3 libdevmapper1.02.1 libdns81 libdrm-intel1 libdrm-nouveau1a libdrm-radeon1 libdrm2 libgcrypt11 libglib2.0-0 libgnutls26
  libgssapi3-heimdal libhcrypto4-heimdal libheimbase1-heimdal libheimntlm0-heimdal libhx509-5-heimdal libisc83 libisccc80 libisccfg82 libkrb5-26-heimdal
  libldap-2.4-2 liblockfile-bin liblockfile1 liblwres80 libmagic1 libpci3 libplymouth2 libpolkit-gobject-1-0 libpython2.7 libroken18-heimdal libssl1.0.0
  libudev0 libwind0-heimdal libxcb1 libxml2 lsb-base lsb-release multiarch-support openssl pciutils perl perl-base perl-modules plymouth
  plymouth-theme-ubuntu-text procps python python-apport python-apt python-apt-common python-httplib2 python-lazr.restfulclient python-minimal
  python-openssl python-problem-report python2.7 python2.7-minimal rsyslog sudo tzdata udev unzip update-manager-core w3m xkb-data
103 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 44.5 MB of archives.
After this operation, 19.5 kB of additional disk space will be used.
Do you want to continue [Y/n]? 

This takes about 5-10 minutes. Drink a soda, and contemplate life ...

Set Hostname

Hostnames should be set up with the next item on the [HostnamePolicy|Hostname Policy]. On Ubuntu, you need to edit /etc/hostname and /etc/hosts. You also need to add the LDAP server IP to the hosts file so that LDAP keeps working even if DNS is down.

root@localhost:~# cat /etc/hostname 
boron
root@localhost:~# cat /etc/hosts
127.0.0.1	boron.li694-22 boron localhost
LDAP-IP         ldap-server.li694-22

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
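
If you'd rather not hand-edit the hosts file, the LDAP entry can be added idempotently. This is a minimal sketch, not part of the deployment kit: the function name add_hosts_entry is made up for illustration, and 192.0.2.10 is a placeholder, not the real LDAP IP.

```shell
# Append "IP NAME" to a hosts file unless NAME already appears there.
# Usage: add_hosts_entry FILE IP NAME   (hypothetical helper)
add_hosts_entry() {
    hosts_file=$1 ip=$2 name=$3
    # Ignore comment lines; -w makes grep match NAME as a whole word.
    if grep -v '^[[:space:]]*#' "$hosts_file" | grep -qwF -- "$name"; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
}

# Example (run as root; substitute the real LDAP server IP):
# add_hosts_entry /etc/hosts 192.0.2.10 ldap-server.li694-22
```

Running it twice leaves a single entry, so it is safe to re-run while fixing up a node.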

Load the new hostname with 'hostname -F'

root@localhost:~# hostname -F /etc/hostname 
root@localhost:~# 

Note: the prompt won't change until you log out and log back in.

Switch Over To Distro Kernels

Linode uses a customized kernel instead of stock Ubuntu kernels. While this works "well enough" for most people, it lacks AppArmor and can cause unexpected splats, as it doesn't have a ramdisk.

Here's Linode's guide on how to fix it: https://library.linode.com/custom-instances/pv-grub-howto

When you're done, uname -a should say something like this

root@boron:~# uname -a
Linux boron 3.2.0-60-virtual #91-Ubuntu SMP Wed Feb 19 04:13:28 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Setup Networking

On the Linode panel, make sure the node has an internal IP address so that other nodes in the data centre can access it, then note it down. You have to set up a static IP address configuration. Linode has a decent guide for this, but the quick and dirty version is that you need to edit /etc/network/interfaces to look like this:

# The loopback interface
auto lo
iface lo inet loopback

# Configuration for eth0 and aliases

# This line ensures that the interface will be brought up during boot.
auto eth0 eth0:0

# eth0 - This is the main IP address that will be used for most outbound connections.
# The address, netmask and gateway are all necessary.
iface eth0 inet static
 address PUBLIC-IP-HERE
 netmask 255.255.255.0
 gateway   GATEWAY-HERE
# eth0:0
# This is the private IP address.
iface eth0:0 inet static
 address INTERNAL-IP-HERE
 netmask 255.255.128.0
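
To avoid typos when filling in the placeholders, the file can be rendered from three values. This is only a sketch: render_interfaces is a hypothetical helper, the netmasks are the ones shown above, and the auto line lists only the two interfaces that are actually defined.

```shell
# Print an /etc/network/interfaces file for a node with one public
# address and one private alias.
# Usage: render_interfaces PUBLIC_IP GATEWAY INTERNAL_IP
render_interfaces() {
    public_ip=$1 gateway=$2 internal_ip=$3
    cat <<EOF
# The loopback interface
auto lo
iface lo inet loopback

# Bring up the public interface and its private alias at boot.
auto eth0 eth0:0

# eth0: public address, used for most outbound connections.
iface eth0 inet static
 address $public_ip
 netmask 255.255.255.0
 gateway $gateway

# eth0:0: private (internal) address.
iface eth0:0 inet static
 address $internal_ip
 netmask 255.255.128.0
EOF
}

# Example (as root; the addresses are placeholders):
# render_interfaces 203.0.113.7 203.0.113.1 192.168.130.5 > /etc/network/interfaces
```

Check the output by eye before redirecting it over the real file, since a wrong gateway will cut off your console-free access.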

You can apply the new IP configuration with this

root@boron:~# ifdown eth0 && ifup eth0 eth0:0
resolvconf: Error: /etc/resolv.conf isn't a symlink, not doing anything.
resolvconf: Error: /etc/resolv.conf isn't a symlink, not doing anything.
ssh stop/waiting
ssh start/running, process 1087
resolvconf: Error: /etc/resolv.conf isn't a symlink, not doing anything.
ssh stop/waiting
ssh start/running, process 1127
root@boron:~#  

Then, kill the dhcp client by doing this:

root@boron:~# pidof dhclient
518
root@boron:~# kill 518 # use the value you got from pidof

You can then apply the new resolver config by setting /etc/resolv.conf to this:

domain li694-22
nameserver 192.168.174.17
nameserver 192.168.170.201
nameserver 72.14.188.5

Finally, install some useful packages that are missing from the stock Linode config:

root@boron:~# apt-get install command-not-found python-software-properties

Setting up LDAP

You now need the reader password and the slapd_ca.pem, pam-configs_mkhomedir, and ldap_ssh.sh files from the deployment kit.

root@boron:~# apt-get install ldap-auth-client libpam-ldap ldap-utils

When asked configuration questions, here's what you enter:

  • LDAP server identifier: ldap://ldap-server.li694-22/ ldap://ldap-slave01.li694-22/
  • Distinguished Name: dc=li694-22
  • LDAP version to use: 3
  • Make local Root admin: No
  • Does LDAP require login: Yes
  • LDAP username: cn=ldapReader,dc=li694-22
  • LDAP password is in the deployment kit

(If you make a mistake, run dpkg-reconfigure libpam-ldap to re-run the wizard.)

Purge nscd; we don't need it, and it causes issues (it gets auto-installed by libpam-ldap).

root@boron:~# apt-get purge nscd

Now, take the slapd_ca.pem, and stick it in /usr/share/ca-certificates/li694-22 (you have to make this folder)

Open up /etc/ca-certificates.conf in your favorite editor, add the following to the end

li694-22/slapd_ca.pem


You need to now install the certificate into the system. Just run update-ca-certificates

root@boron:/usr/share/ca-certificates/li694-22# update-ca-certificates 
Updating certificates in /etc/ssl/certs... 1 added, 0 removed; done.
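
The copy-and-register steps above can be sketched as a small function. install_ldap_ca is a hypothetical name, and the directories are passed in as arguments so the function can be tried against a scratch directory first.

```shell
# Copy a CA certificate into the li694-22 folder and register it in
# ca-certificates.conf, without adding a duplicate line on re-runs.
# Usage: install_ldap_ca CERT SHARE_DIR CONF_FILE
install_ldap_ca() {
    cert=$1 share_dir=$2 conf=$3
    entry="li694-22/$(basename "$cert")"
    mkdir -p "$share_dir/li694-22" || return 1
    cp "$cert" "$share_dir/li694-22/" || return 1
    # -x matches the whole line, -F treats the entry as a fixed string.
    grep -qxF -- "$entry" "$conf" 2>/dev/null || printf '%s\n' "$entry" >> "$conf"
}

# Example (as root, from the directory holding the deployment kit files):
# install_ldap_ca slapd_ca.pem /usr/share/ca-certificates /etc/ca-certificates.conf
# update-ca-certificates
```

update-ca-certificates still has to be run afterwards; the function only puts the files in place.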

Then open /etc/ldap.conf with your favorite editor

Find and uncomment:

#ssl start_tls
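
If you prefer not to open an editor, the same change can be made with sed. A minimal sketch, assuming GNU sed (as shipped on Ubuntu) and that the line appears commented out exactly once; enable_start_tls is a made-up name.

```shell
# Uncomment the "ssl start_tls" line in an ldap.conf-style file.
# Other commented ssl lines (such as "#ssl on") are left alone.
# Usage: enable_start_tls FILE
enable_start_tls() {
    sed -i 's/^#[[:space:]]*ssl[[:space:]]\{1,\}start_tls[[:space:]]*$/ssl start_tls/' "$1"
}

# Example (as root):
# enable_start_tls /etc/ldap.conf
# grep -n '^ssl' /etc/ldap.conf   # confirm the line is now active
```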

Open /etc/ldap/ldap.conf, and set the following:

BASE    dc=li694-22      
URI     ldap://ldap-server.li694-22/ ldap://ldap-slave01.li694-22 

LDAP should be set up now; you just need to enable it in PAM and update NSS. First, copy pam-configs_mkhomedir to its proper place:

root@boron:~# cp pam-configs_mkhomedir /usr/share/pam-configs/mkhomedir

Then update PAM. PAM should list "Active mkhomedirs" as an option if the config file was properly setup.

root@boron:~# pam-auth-update 
root@boron:~# auth-client-config -t nss -p lac_ldap

You should be able to run id and get valid results at this point

root@boron:~# id mcasadevall
uid=2500(mcasadevall) gid=2501(sysops) groups=2501(sysops),2500(firefighters),2502(db)

Setup Machine ACLs

Setup Sudoers

Now we need to setup the ability to sudo, and to limit who can access a box

Sudo configuration is dependent on the node and its role, but every box should have the following line appended to sudoers:

%sysops  ALL=(ALL) NOPASSWD: ALL
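
One way to add the line without touching /etc/sudoers directly is a drop-in file. This is a sketch of an alternative, not what the guide prescribes: it assumes the stock Ubuntu sudoers, which includes /etc/sudoers.d, and write_sysops_sudoers is a hypothetical helper.

```shell
# Write the sysops rule to a sudoers drop-in file with the mode sudo expects.
# Usage: write_sysops_sudoers TARGET_FILE
write_sysops_sudoers() {
    target=$1
    printf '%s\n' '%sysops  ALL=(ALL) NOPASSWD: ALL' > "$target" || return 1
    chmod 0440 "$target"
}

# Example (as root). visudo -cf checks the syntax of the named file;
# keep a root shell open until the check passes.
# write_sysops_sudoers /etc/sudoers.d/sysops && visudo -cf /etc/sudoers.d/sysops
```

The drop-in file name must not contain a dot or end in a tilde, or sudo will skip it.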

No user should use a password at any time.

Setup ACL

Now that that is done, the last step is to limit who can access the box based on POSIX groups. Once again, this varies on a box-by-box basis, but it's controlled by a single line.

Open up /etc/security/access.conf

This file is a Debian-derivative ACL list controlling who can and can't log in. The syntax is explained in the file, but here's an example:

-:ALL EXCEPT root (firefighters) (global_services): ALL

In this example, root (local user) can log in, as well as those in the firefighters and global_services groups; all others are denied. You can add multiple groups or users, e.g.

-:ALL EXCEPT root (sysops) (db) (global_services) : ALL

Every machine should allow access to global_services so that services that need to pop into a shell can get in.
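
Since the rule always has the same shape, it can be generated from a list of groups. A small sketch with a hypothetical helper, render_access_rule, which always keeps root and global_services in the exception list as described above:

```shell
# Print an access.conf rule that denies everyone except root, the
# given POSIX groups, and global_services.
# Usage: render_access_rule GROUP...
render_access_rule() {
    allowed=""
    for group in "$@"; do
        allowed="$allowed ($group)"
    done
    printf '%s%s %s\n' '-:ALL EXCEPT root' "$allowed" '(global_services): ALL'
}

# Example: print the rule for a database box, then add it to
# /etc/security/access.conf by hand or with >>.
# render_access_rule sysops db
```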

Setup SSH

Upgrading SSH

Unfortunately, the version of OpenSSH shipped in precise is too old to support LDAP key retrieval, so we need to upgrade it. I threw together an updated package and loaded it onto a PPA, available here: https://launchpad.net/~li69422-staff/+archive/backports-for-precise

Adding it to the system is quick and painless

root@boron:~# apt-add-repository ppa:li69422-staff/backports-for-precise
You are about to add the following PPA to your system:
 
 More info: https://launchpad.net/~li69422-staff/+archive/backports-for-precise
Press [ENTER] to continue or ctrl-c to cancel adding it

gpg: keyring `/tmp/tmpsvGLrk/secring.gpg' created
gpg: keyring `/tmp/tmpsvGLrk/pubring.gpg' created
gpg: requesting key AEA37004 from hkp server keyserver.ubuntu.com
gpg: /tmp/tmpsvGLrk/trustdb.gpg: trustdb created
gpg: key AEA37004: public key "Launchpad PPA for Packages for li694-22" imported
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)
OK

Now we just need to install upgrades, and grab it.

root@boron:~# apt-get update && apt-get dist-upgrade

APT will print the following before upgrading

The following packages will be upgraded:
  openssh-client openssh-server

Note: You can ignore the following error while installing it:

/var/lib/dpkg/info/openssh-server.postinst: 295: /var/lib/dpkg/info/openssh-server.postinst: deb-systemd-helper: not found

Setting Up SSH-LDAP Authentication

The magic that makes SSH-LDAP authentication work is a directive in sshd_config that lets sshd pull authorized_keys dynamically from a script. This is a quick and easy two-step process.

First, copy ldap_ssh.sh to /etc/ssh

root@robot:~# cp ldap_ssh.sh /etc/ssh

Now open /etc/ssh/sshd_config and add the following lines at the bottom

AuthorizedKeysCommand /etc/ssh/ldap_ssh.sh
AuthorizedKeysCommandUser nobody
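
For readers without the deployment kit, here is a rough idea of what such a script can look like. This is a hypothetical stand-in, not the real ldap_ssh.sh: it assumes keys are stored in an sshPublicKey attribute, that each key is plain text rather than base64-encoded LDIF, and that the lookup is allowed without extra bind options. Since this directory requires a login, you would likely need to add ldapsearch's -D and -y options for the reader account.

```shell
# Hypothetical sketch of an AuthorizedKeysCommand script.
LDAP_URI="ldap://ldap-server.li694-22/"
LDAP_BASE="dc=li694-22"

# Keep only the values of sshPublicKey attributes from LDIF on stdin.
extract_ssh_keys() {
    sed -n 's/^sshPublicKey: //p'
}

# Print the public keys stored for one user, one per line.
lookup_keys() {
    user=$1
    # Reject anything that is not a plain account name, so the value
    # cannot alter the LDAP search filter.
    case $user in
        ''|*[!a-z0-9_-]*) return 1 ;;
    esac
    # -x simple bind, -ZZ require StartTLS, -LLL bare LDIF output,
    # -o ldif-wrap=no keeps each long key on a single line.
    ldapsearch -x -ZZ -LLL -o ldif-wrap=no -H "$LDAP_URI" -b "$LDAP_BASE" \
        "(&(objectClass=posixAccount)(uid=$user))" sshPublicKey | extract_ssh_keys
}

# When saved as the script sshd calls, end the file with:
# lookup_keys "$1"
```

sshd passes the login name as the first argument and treats each output line as an authorized_keys entry.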

Restart SSH

root@boron:~# service ssh restart
ssh stop/waiting
ssh start/running, process 5327

And test it by logging in via SSH directly

Locking Down SSH

Once you confirm that SSH is set up properly, lock down the configuration file. The following should be true:

  • SSH should only listen on internal IPs (we can proxy in from boron)
  • Password authentication should be disabled
  • Root login should be disabled

This requires three edits in /etc/ssh/sshd_config. First, the listen lines:

ListenAddress INTERNAL-IP

Then password authentication:

PasswordAuthentication no

And then disable root login. You have to add this line, as it's not in the example config:

PermitRootLogin no

Save your changes, and restart ssh

root@boron:~# service ssh restart
ssh stop/waiting
ssh start/running, process 5752
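
Before closing your root console, it's worth checking that all three settings actually made it into the file. The sketch below is a simple grep-based check with hypothetical helper names; it expects the keywords spelled as in this guide and does not replace sshd's own parser (sshd -t checks the syntax of the real config).

```shell
# Succeed if FILE sets KEYWORD to a value matching VALUE_REGEX.
# Usage: require_setting FILE KEYWORD VALUE_REGEX
require_setting() {
    grep -Eq "^$2[[:space:]]+$3[[:space:]]*\$" "$1" && return 0
    echo "sshd_config: $2 is not set as expected" >&2
    return 1
}

# Check the three lock-down settings; report every failure, not just the first.
# Usage: check_sshd_lockdown FILE
check_sshd_lockdown() {
    status=0
    require_setting "$1" ListenAddress '[^[:space:]]+' || status=1
    require_setting "$1" PasswordAuthentication no || status=1
    require_setting "$1" PermitRootLogin no || status=1
    return "$status"
}

# Example (as root):
# check_sshd_lockdown /etc/ssh/sshd_config && sshd -t && echo "looks locked down"
```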

Setting up Kerberos (KRB5)

Setting up single sign-on makes life really easy when working across a bunch of nodes, so we have Kerberos available to facilitate it. By running kinit, you can generate a Kerberos ticket and zoom around in style.

A quick APT gets us what we need:

root@lithium:~# apt-get install krb5-user
  • Default kerberos realm: LI694-22

To test, switch to your user account, and run kinit:

mcasadevall@lithium:~$ kinit
Password for mcasadevall@LI694-22: 
mcasadevall@lithium:~$ klist
Ticket cache: FILE:/tmp/krb5cc_2500
Default principal: mcasadevall@LI694-22

Valid starting    Expires           Service principal
23/03/2014 08:22  23/03/2014 18:22  krbtgt/LI694-22@LI694-22
	renew until 24/03/2014 08:22

If you got a ticket, you're done with user configuration.

Now you need to generate a ticket as an admin, and generate the necessary server-side bits. Due to a bug, you need to edit /etc/krb5.conf.

Under realms, add the following:

        LI694-22 = {
                admin_server = kdc-master.li694-22
        }

sudo back to root, and add the principal:

mcasadevall@lithium:~# kdestroy # (clean out old tickets)
mcasadevall@lithium:~# kinit krb/admin # (password is in master_passwd on ldap-master)
Password for krb/admin@LI694-22: 
mcasadevall@lithium:~$ kadmin
Authenticating as principal krb/admin@LI694-22 with password.
Password for krb/admin@LI694-22: 
kadmin:

From here, you have to generate a principal for the node, then export it into the keytab. That's done with two commands:

kadmin: add_principal -randkey host/carbon.li694-22@LI694-22
WARNING: no policy specified for host/carbon.li694-22@LI694-22; defaulting to no policy
add_principal: Principal or policy already exists while creating "host/carbon.li694-22@LI694-22".
kadmin: ktadd host/carbon.li694-22@LI694-22
(lots of information)
kadmin: quit

Now it's time for round 2 with SSH's config. You need to edit both the client and server settings this time.

Set the following in /etc/ssh/sshd_config
KerberosAuthentication yes
GSSAPIAuthentication yes
GSSAPICleanupCredentials yes

Now you need to edit the outbound config to send Kerberos tickets

Set the following in /etc/ssh/ssh_config (under Host *)
GSSAPIDelegateCredentials yes

Confirm you can SSH freely between nodes, and you're done

Install Nagios Plugins

We use a fork of Nagios for service monitoring. Setup is done on boron (sentinal.soylentnews.org), but the short version is that you just need to install the plugins locally, then set up service monitoring via the icinga config files. This is a one-liner.

$ sudo apt-get install nagios-plugins nagios-plugins-extra

Cleanup

Now install pwgen:

apt-get install pwgen

Generate a new root password with pwgen (e.g. pwgen 50 1), and set it.

Update the master_password file, and take a breath, you're done!