Difference between revisions of "SystemAdministration"

From SoylentNews
(Who we are)
(Servers)
Line 47:
  :: [[Lithium]]
  * staff-slash -- Staff only Slash server.
- :: [[Nitrogen]]
+ :: [[Boron]]
  * irc -- IRC server and related services.
- :: [[Carbon]]
+ :: [[Beryllium]]
  * [[SystemAdministration/Backups|backups]] -- Backup services.
  :: [[Oxygen]]

Revision as of 04:30, 5 April 2015

TeamPages - parent, Development

Welcome

This is a comprehensive index covering system administration and management of our clusters, as well as some of the more arcane bits of setup required to make it all work.

Who we are

Sysop Team Main Page

nick         position   timezone
paulej72     Co-leader  UTC-4 (EDT)
mechanicjay  Co-leader  UTC-4 (EST/EDT)
NCommander   Member     UTC-9 (AKDT)
Audioguy     Member     UTC-7 (PST/PDT)


Index of Development Pages and Resources

Servers

List of servers on linode: Category:SystemAdministration/Servers

  • soylent-www -- Primary Apache and slash servers for main site.
Hydrogen, Fluorine, Boron
  • soylent-db -- MySQL servers, holds the slash database.
Helium, Neon
  • dev -- Development server.
Lithium
  • staff-slash -- Staff only Slash server.
Boron
  • irc -- IRC server and related services.
Beryllium
  • backups -- Backup services.
Oxygen
  • directory services -- LDAP and Kerberos.
Helium, Boron
Beryllium

Known Problems

  • Need a cron job to back up the server
  • No init script for Apache.
  • Broken HTTPS configuration
    • Mostly fixed, Slash is the problem child now
  • Gluster occasionally misfires, which shows up as Apache or slashd crashing, depending on the node. It can be fixed with the following command cocktail:
sudo umount -l /srv/soylentnews.org # tells Linux to lazy unmount, required when glusterd took a dive
sudo service glusterfs-server restart
sudo mount -a # will remount gluster without an issue

Then restart Apache/slashd as required.
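
The recovery steps above could be wrapped in a small helper so they always run in the same order. This is only a sketch: the function names `run` and `gluster_recover` and the `DRY_RUN` switch are made up for illustration, and the service names passed in for Apache and slashd (for example `apache2`) are assumptions that depend on how each node is set up.

```shell
#!/bin/sh
# Hypothetical helper wrapping the Gluster recovery cocktail from this page.
# Names used here (run, gluster_recover, DRY_RUN) are illustrative only.

MOUNT_POINT="${MOUNT_POINT:-/srv/soylentnews.org}"

# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# gluster_recover [SERVICE...]: remount Gluster, then restart the listed services.
gluster_recover() {
    # Lazy unmount; keep going if the mount is already gone.
    run sudo umount -l "$MOUNT_POINT" || echo "warning: umount failed, continuing" >&2
    run sudo service glusterfs-server restart || return 1
    # Remounts everything listed in /etc/fstab, including the Gluster volume.
    run sudo mount -a || return 1
    for svc in "$@"; do
        run sudo service "$svc" restart || return 1
    done
}
```

With the file sourced, `DRY_RUN=1 gluster_recover apache2` prints the commands without running them; dropping `DRY_RUN` runs them for real.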

Stuff That Needs To Be Addressed

  • Hydrogen is offline due to performance problems
  • Gluster is unstable on Fluorine and Boron, and sometimes on Hydrogen
  • icinga/monitoring project needs to be picked up and completed
    • Landscape - I appreciate NCommander's ability to obtain, for free, a product that is normally sold, but if it is not being used it should not be running and using resources. (Helium, perhaps others)
  • Should have some sort of SN password safe
  • Privilege Duplication - making sure that all services have multiple admins
  • DNS: Audioguy is investigating some goofiness
  • Systems Documentation needs to be revised and brought up to date.
  • Work Coordination: communication is not always good when fundamental things change.
  • There is no firewall configuration at all, something I normally set up before even one network cable is plugged in. I understand the desire to have an open system on public interfaces, but I see no reason why systems that are not publicly accessible, such as database backends, should not be firewalled off from Chinese and other such hackers. For example, at the moment I am writing this, attempts are being made on Helium from 120.192.20.162 (120.192.0.0/11, China Mobile Communications Corporation).
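
As a starting point for the firewall item above, the rules for a database backend could be generated and reviewed before anything is applied. This is a rough sketch, not an agreed policy: the function name `db_firewall_rules` is hypothetical, 3306 is the default MySQL port and may not match our configuration, and the source addresses in the usage note are documentation placeholders, not our real hosts.

```shell
#!/bin/sh
# db_firewall_rules PORT SOURCE...: print iptables commands that accept TCP
# traffic to PORT only from the listed source addresses or CIDR ranges and
# drop everything else aimed at that port. Nothing is applied here; the
# output is meant to be reviewed first.
db_firewall_rules() {
    port="$1"
    shift
    for src in "$@"; do
        echo "iptables -A INPUT -p tcp --dport $port -s $src -j ACCEPT"
    done
    # Final rule: anything not accepted above is dropped.
    echo "iptables -A INPUT -p tcp --dport $port -j DROP"
}
```

For example, `db_firewall_rules 3306 192.0.2.10 192.0.2.11` prints two ACCEPT rules followed by the DROP rule. Once the list looks right, the output can be run as root. The ACCEPT rules have to come before the DROP rule, which is why the function always emits the DROP last.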


Work Notes

DNS is run and managed entirely by Linode's DNS Manager service. This was an expedient decision when we were trying to get off Bluehost. We may want to investigate keeping the master zone file on Helium or Boron and having external services handle serving our DNS.
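
If we do move to a master zone file on one of our own boxes, we would need a way to confirm that the external servers have picked up each change. One option is to compare SOA serials with `dig`. The sketch below assumes `dig` is installed. The function names `soa_serial` and `check_serials` are hypothetical, as are whatever zone and nameserver names get passed in.

```shell
#!/bin/sh
# soa_serial: read `dig +short SOA` output on stdin and print the serial,
# which is the third whitespace-separated field of the SOA record.
soa_serial() {
    awk '{ print $3 }'
}

# check_serials ZONE MASTER SECONDARY...: compare the SOA serial served by
# each secondary against the one served by the master.
check_serials() {
    zone="$1"
    master="$2"
    shift 2
    want=$(dig +short SOA "$zone" "@$master" | soa_serial)
    for ns in "$@"; do
        got=$(dig +short SOA "$zone" "@$ns" | soa_serial)
        if [ "$got" = "$want" ]; then
            echo "$ns ok ($got)"
        else
            echo "$ns STALE ($got, master has $want)"
        fi
    done
}
```

It would be called as `check_serials ZONE MASTER SECONDARY...`, with the master being whichever of Helium or Boron ends up holding the zone file.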

Resources