Networks
========
192.168.11.0/24 (18-port IB switch): Legacy network, non-production systems including storage
192.168.12.0/24 (12-port IB switch): KATRIN Storage network
192.168.13.0/24 (12-port IB switch): HPC Cloud & Computing network
192.168.26.0/24 (Ethernet): Infrastructure network (OpenShift nodes and everything else)
192.168.16.0/22: External IPs for testing and production
192.168.111.0/24 (OpenVPN): Gateway to Katrin network using Master1 tunnel
192.168.112.0/24 (OpenVPN): Gateway to Katrin network using Master2 tunnel
192.168.212.0/24: Staging counterpart of the KATRIN storage network?
192.168.213.0/24: Staging counterpart of the HPC cloud & computing network?
192.168.226.0/24 (Ethernet): Staging network (Virtual OpenShift and other nodes)
192.168.216.0/22: External IPs for staging
192.168.221.0/24 (OpenVPN): Gateway to Katrin network using staging Master1 tunnel
192.168.222.0/24 (OpenVPN): Gateway to Katrin network using staging Master2 tunnel
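A minimal sketch (Python; the subnets are the ones listed above, with abbreviated descriptions) for checking which of these networks an address belongs to:

    import ipaddress

    # Subnets from the list above (descriptions abbreviated)
    NETWORKS = {
        "192.168.11.0/24": "legacy IB (non-production, incl. storage)",
        "192.168.12.0/24": "KATRIN storage IB",
        "192.168.13.0/24": "HPC cloud & computing IB",
        "192.168.16.0/22": "external IPs (testing/production)",
        "192.168.26.0/24": "infrastructure Ethernet",
        "192.168.216.0/22": "external IPs (staging)",
        "192.168.226.0/24": "staging Ethernet",
    }

    def classify(ip: str) -> str:
        # Return the role of the first network containing the address
        addr = ipaddress.ip_address(ip)
        for net, role in NETWORKS.items():
            if addr in ipaddress.ip_network(net):
                return role
        return "unknown network"

    print(classify("192.168.26.10"))  # -> infrastructure Ethernet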
KIT resources
=============
- ipekatrin*.ipe.kit.edu Cluster nodes
- ipekatrin[1:2].ipe.kit.edu Master nodes with fixed IPs (either one may be down)
+ katrin[1:2].ipe.kit.edu Virtual IPs assigned to master nodes (HA)
+ kaas.kit.edu (katrin.ipe.kit.edu) DNS-based load balancer between katrin[1:2].ipe.kit.edu
+ *.kaas.kit.edu (*.katrin.ipe.kit.edu) Default application domain?
- katrin.kit.edu Apache/mod_proxy pod (in DNS, point a CNAME to katrin.ipe.kit.edu)
+ openshift.ipe.kit.edu Gateway (VIP) to the staging cluster (a single IP migrating between 2 nodes)
- *.openshift.ipe.kit.edu Default application domain for staging cluster
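A small sketch (Python; assumes the DNS records above are already in place) to check that kaas.kit.edu only ever resolves within the two master VIPs:

    import socket

    def resolve(host: str) -> set:
        # Collect all IPv4 addresses the name currently resolves to
        infos = socket.getaddrinfo(host, None, family=socket.AF_INET)
        return {info[4][0] for info in infos}

    # VIPs assigned to the master nodes (HA)
    masters = resolve("katrin1.ipe.kit.edu") | resolve("katrin2.ipe.kit.edu")
    # The DNS-based load balancer should stay within that set
    print("kaas.kit.edu within master VIPs:", resolve("kaas.kit.edu") <= masters)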
Storage
=======
LVM VGs
  VolGroup00
    -> LogVol*: System partitions
    -> docker-pool: Docker storage
  Katrin
    -> Heketi PD (we reserve space but do not configure Heketi yet)
       -> vg_*
          -> Heketi-managed Gluster Volumes
    -> Katrin (mounted at '/mnt/ands')
       -> Space for manually-managed Gluster Bricks
       -> Storage for Galera / Cassandra / etc.?
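A rough sketch of the corresponding LVM commands (Python over the standard LVM CLI; the device path /dev/sdb and all sizes are placeholders, and the Heketi space is only reserved, not configured):

    import subprocess

    def run(*cmd):
        subprocess.run(cmd, check=True)

    # Thin pool for Docker storage inside VolGroup00
    run("lvcreate", "--thinpool", "docker-pool", "-L", "100G", "VolGroup00")

    # Dedicated VG for Gluster bricks; /dev/sdb is a placeholder device
    run("vgcreate", "Katrin", "/dev/sdb")

    # LV for the manually-managed bricks, mounted at /mnt/ands;
    # the remaining free space in the VG stays reserved for Heketi
    run("lvcreate", "-n", "katrin", "-L", "500G", "Katrin")
    run("mkfs.xfs", "/dev/Katrin/katrin")
    run("mount", "/dev/Katrin/katrin", "/mnt/ands")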
Gluster Volume Types:
  tmp:   distribute?      Various data which should be preserved, but which is not critical if lost or temporarily inaccessible (logs, etc.) [ check if we can still write when one brick is gone ]
  cfg:   replica=3        Small and critical data sets (configs, sources, etc.)
  cache: replica+arbiter  Large regenerable data which should nevertheless always be available [ potentially we can use disperse to save space ]
  data:  replica+arbiter  Very large and critical data
  db:    disperse         A few very large files, e.g. a large single-table database (ADEI has many tables)
Scaling storage:
  cfg:        3 nodes is enough
  cache/data: [d][d][a] => [da][d ][ad][ d] => [d ][d ][ d][ d][aa] => further increase in pairs ([d] = data brick, [a] = arbiter); at some point add a second arbiter node (see the sketch below)
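A sketch of the corresponding gluster CLI calls (hostnames and brick paths are placeholders), creating a replica+arbiter volume and then growing it by one pair, as in the diagram above:

    import subprocess

    def gluster(*args):
        subprocess.run(["gluster", "volume", *args], check=True)

    # [d][d][a]: two data bricks plus one arbiter brick
    gluster("create", "data", "replica", "3", "arbiter", "1",
            "node1:/mnt/ands/bricks/data", "node2:/mnt/ands/bricks/data",
            "node3:/mnt/ands/bricks/data-arbiter")
    gluster("start", "data")

    # Grow in pairs: one more data pair plus its arbiter brick
    gluster("add-brick", "data", "replica", "3", "arbiter", "1",
            "node3:/mnt/ands/bricks/data2", "node4:/mnt/ands/bricks/data2",
            "node1:/mnt/ands/bricks/data2-arbiter")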
Gluster Volumes:
  name          type   mountpoint              notes
  provision     cfg    /mnt/provision          Provisioning volume which is not expected to be mounted in the containers (temporarily may contain secret information, etc.)
  openshift     cfg    /mnt/openshift          Multi-purpose: various small size configurations (adei, apache, etc.)
  temporary     tmp    /mnt/temporary          Multi-purpose: various logs & temporary files
  ?adei         cfg    /mnt/adei/adei
  adei-db       cache  /mnt/adei/db
  adei-tmp      tmp    /mnt/adei/tmp
  katrin-mysql  data   /mnt/katrin/mysql
  katrin-data   cfg    /mnt/katrin/archive
  katrin-kali   cache  /mnt/katrin/storage
  katrin-tmp    tmp    /mnt/katrin/workspace
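A sketch for mounting these volumes at the paths from the table (master1 as the mount server is a placeholder; only a few rows shown):

    import os
    import subprocess

    # Volume name -> mount point, per the table above (subset)
    MOUNTS = {
        "provision": "/mnt/provision",
        "openshift": "/mnt/openshift",
        "temporary": "/mnt/temporary",
        "adei-db": "/mnt/adei/db",
        "katrin-mysql": "/mnt/katrin/mysql",
    }

    for volume, path in MOUNTS.items():
        os.makedirs(path, exist_ok=True)
        subprocess.run(["mount", "-t", "glusterfs",
                        f"master1:/{volume}", path], check=True)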
OpenShift Volumes:
  name          type/mode  volume        notes
  etc           cfg/ro     openshift     Various configurations (ADEI & Apache configs, other stuff in etc.)
  src           cfg/ro     openshift     Interpreted source files
  log           tmp/rw     tmp           Stuff in /var/log
  tmp           tmp/rw     tmp           Various temporary files
  adei-db       data/rw    adei-db       ADEI cache database and a few primary sources [ will take ages to regenerate, so we can't really consider it a dispensable cache ]
  adei-tmp      tmp/rw     adei-tmp      ADEI, Apache, and Cron logs [ technically we also have downloads here, which are more cache than tmp... but I think it is fine for now ]
  adei-cfg      cfg/ro     adei?         ADEI & Apache configs
  adei-src      cfg/ro     adei?         ADEI sources
  katrin-mysql  cfg/rw     katrin-mysql  KATRIN database with configurations, etc.
  katrin-data   data/rw    katrin-data   KATRIN data archives, all primary raw data from Orca, etc.
  katrin-kali   cache/rw   katrin-kali   Generated ROOT files [ can we make this separation? Marco uses hardlinks ]
  katrin-proc   tmp/rw     katrin-proc   Data processing volume (inbox, etc.)
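A sketch generating an OpenShift PersistentVolume for one row of this table via the in-tree glusterfs plugin (the endpoints object name 'gluster' and the size are placeholders):

    import json
    import subprocess

    def gluster_pv(name: str, volume: str, size: str, readonly: bool) -> dict:
        # ro rows map to ReadOnlyMany, rw rows to ReadWriteMany
        mode = "ReadOnlyMany" if readonly else "ReadWriteMany"
        return {
            "apiVersion": "v1",
            "kind": "PersistentVolume",
            "metadata": {"name": name},
            "spec": {
                "capacity": {"storage": size},
                "accessModes": [mode],
                "glusterfs": {"endpoints": "gluster",
                              "path": volume,
                              "readOnly": readonly},
            },
        }

    pv = gluster_pv("katrin-data", "katrin-data", "10Ti", readonly=False)
    subprocess.run(["oc", "create", "-f", "-"],
                   input=json.dumps(pv).encode(), check=True)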
Services
========
- Keepalived
- OpenVPN
- Gluster
- MySQL Galera (?)
- Cassandra (?)
- oVirt (?)
- OpenShift Master / Node
- Heketi
- Apache Router
- ADEI Services
- Apache Spark, etc.
Inventories
===========
- staging & production will operate in parallel (staging in Vagrant and production on bare metal)
- testing is just for pre-production tests and will be removed once production is running
Labels
======
- We specify whether a node is a master and whether it provides fat storage for GlusterFS (see the sketch after this list)
- All nodes are currently in the 'infra' region (student computers, for example, would be non-infra nodes; so would nodes outside of KIT)
- The servers in the cellar are in the 'default' zone (if we put something in the 4th-floor server room, we would define a new zone there)
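A sketch of applying this scheme with oc (the node name is an example; the exact label keys for the master and fat-storage flags are not fixed yet and are assumptions here):

    import subprocess

    def label_node(node: str, **labels):
        args = [f"{key}={value}" for key, value in labels.items()]
        subprocess.run(["oc", "label", "node", node, *args], check=True)

    # region/zone per the notes above; 'master' and 'fat_storage'
    # are assumed label names for the master / fat-storage flags
    label_node("ipekatrin1.ipe.kit.edu",
               region="infra", zone="default",
               master="1", fat_storage="1")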
Computing
=========
- Define CUDA nodes and OpenCL nodes
- The Intel Xeon Phi in ipepdvcompute2 is replaced by a new Tesla
- Gen1 UFO servers do not support "Above 64G decoding" and can't run a Xeon Phi. Maybe we can put it in the new Phi server.