🔧 Configuration

Input YAML file

Here is an example of the input YAML file that the installer needs:

version: 0.0.0

ssh:
    user: ec2-user
    private_key_file: /home/admin/andalusia-installer.pem

main_hostname: my_main_host
spark_hostname: the_only_spark_master_host

ips:
    my_main_host:
        ansible_used_ssh_ip: 34.206.187.214
        ansible_used_ssh_port: 22
        private_ip: 172.31.26.187
    the_only_spark_master_host:
        ansible_used_ssh_ip: 34.201.138.164
        ansible_used_ssh_port: 22
        private_ip: 172.31.30.153
    some_hostname:
        ansible_used_ssh_ip: 34.203.197.117
        ansible_used_ssh_port: 22
        private_ip: 172.31.26.105
    another_hostname:
        ansible_used_ssh_ip: 52.201.231.74
        ansible_used_ssh_port: 22
        private_ip: 172.31.29.31

nexus:
    user: admin
    password: "*******"
    endpoint: artifactory.andal.us

ldap:
  client_user: admin
  client_password: "*******"
  host: 172.31.15.134
  port: 389
  search: ou=people,dc=andal,dc=ec2,dc=internal
  root: dc=andal,dc=ec2,dc=internal
  user_search_base: ou=people
  uid_field: uid
  use_tls: False
  user_firstname_field: cn
  user_lastname_field: sn
  user_mail_field: mail
  user_group_membership_field: memberOf

Let's explain every field:

  • version: Version of the input YAML file format. It is used to tell formats apart if the definitions of the YAML file change.

  • ssh:

    • user: SSH user that should exist on all cluster machines

    • private_key_file: path to the PEM-formatted SSH private key that allows the user to log in remotely to every cluster machine

  • main_hostname: This must match one of the hostnames in your list of IPs. It identifies the machine where most of the tools will be installed, such as Jenkins, JupyterHub, Superset, Spark history, and others.

  • spark_hostname: This must match one of the hostnames in your list of IPs. It identifies the machine where the Spark master process is installed.

  • ips: list of hostnames you want to use for each machine in your cluster. Every machine has 3 attributes:

    • ansible_used_ssh_ip: IP used by Ansible to log in via SSH

    • ansible_used_ssh_port: port used by Ansible to log in via SSH

    • private_ip: IP used by Ansible to configure other machines so that they can interact with one another. It can be the same as ansible_used_ssh_ip

  • nexus:

    • user: an existing user in the Nexus server

    • password: the password of the Nexus user

    • endpoint: the URL or IP of the Nexus server

  • ldap:

    • client_user: existing LDAP user used to bind to the server. This user must be able to run queries in the LDAP server.

    • client_password: password of the previous user. If special characters are used, make sure to put the password in double quotes.

    • host: IP of the LDAP server

    • port: port of the LDAP server

    • search: LDAP DN where people or users are found

    • root: LDAP root DN

    • user_search_base: left part of the LDAP DN where people or users are found

    • uid_field: uid field name

    • use_tls: specifies if the LDAP server uses TLS or not

    • user_firstname_field: firstname field name

    • user_lastname_field: lastname field name

    • user_mail_field: mail field name

    • user_group_membership_field: user group membership field name
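
Several of the fields above depend on each other, so it can help to check a file before handing it to the installer. The following is a minimal sketch, not part of the installer: validate_config is a hypothetical helper that works on an already-parsed configuration dictionary (for example, the result of loading the YAML file with a YAML library). The rule that search equals user_search_base joined to root with a comma is an assumption inferred from the example file, not a documented requirement.

```python
# Hypothetical helper, not part of the installer: checks the cross-field
# rules described above on an already-parsed configuration dictionary.

REQUIRED_HOST_KEYS = {"ansible_used_ssh_ip", "ansible_used_ssh_port", "private_ip"}


def validate_config(config):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    ips = config.get("ips") or {}

    # main_hostname and spark_hostname must each match a hostname under ips.
    for key in ("main_hostname", "spark_hostname"):
        hostname = config.get(key)
        if hostname not in ips:
            problems.append(f"{key} '{hostname}' is not listed under ips")

    # Every machine is expected to carry the three attributes.
    for hostname, attrs in ips.items():
        missing = REQUIRED_HOST_KEYS - set(attrs or {})
        if missing:
            problems.append(
                f"host '{hostname}' is missing: {', '.join(sorted(missing))}"
            )

    # Assumption taken from the example file: search is the
    # user_search_base followed by a comma and the root DN.
    ldap = config.get("ldap") or {}
    expected_search = f"{ldap.get('user_search_base')},{ldap.get('root')}"
    if ldap.get("search") != expected_search:
        problems.append(f"ldap.search should be '{expected_search}'")

    return problems


# A reduced version of the example file, written as a Python dictionary.
example = {
    "main_hostname": "my_main_host",
    "spark_hostname": "the_only_spark_master_host",
    "ips": {
        "my_main_host": {
            "ansible_used_ssh_ip": "34.206.187.214",
            "ansible_used_ssh_port": 22,
            "private_ip": "172.31.26.187",
        },
        "the_only_spark_master_host": {
            "ansible_used_ssh_ip": "34.201.138.164",
            "ansible_used_ssh_port": 22,
            "private_ip": "172.31.30.153",
        },
    },
    "ldap": {
        "search": "ou=people,dc=andal,dc=ec2,dc=internal",
        "root": "dc=andal,dc=ec2,dc=internal",
        "user_search_base": "ou=people",
    },
}

print(validate_config(example))  # prints []
```

The function collects every problem instead of stopping at the first one, so a single run reports all mismatches in the file.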
