🧵Prerequisites

Software prerequisites

Python3

The only dependency required on the Installer Machine is Python3, which usually comes preinstalled on recent Linux distributions and can be easily installed on Windows. The currently supported Python3 version is 3.10 or higher.

Python3 comes installed with a tool called Pip3, which can be used to install Ansible. The currently supported Pip3 version is 22.3 or higher.
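As a quick way to confirm these minimums before running the installer, a minimal sketch along the following lines could be used. The helper names (parse_version, python_ok, pip_ok) are illustrative and not part of the installer, and the sketch assumes Pip3 is installed for the same interpreter that runs the script.

```python
import re
import sys
from importlib import metadata

MIN_PYTHON = (3, 10)  # minimum Python3 version stated above
MIN_PIP = (22, 3)     # minimum Pip3 version stated above


def parse_version(text):
    """Turn a string such as '22.3.1' into (22, 3, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def python_ok(version_info=sys.version_info):
    """Return True if the interpreter is at least MIN_PYTHON."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def pip_ok(pip_version=None):
    """Return True if pip is at least MIN_PIP; looks up the installed pip when no version is given."""
    if pip_version is None:
        try:
            pip_version = metadata.version("pip")
        except metadata.PackageNotFoundError:
            return False
    return parse_version(pip_version) >= MIN_PIP


if __name__ == "__main__":
    print("Python3 OK:", python_ok())
    print("Pip3 OK:", pip_ok())
```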

Ansible

Ansible is an open-source IT automation tool that simplifies configuring and managing software, networks, and systems. It automates common tasks such as configuration management, application deployment, cloud provisioning, intra-service orchestration, and continuous delivery. Ansible’s simple declarative language lets users describe their IT infrastructure without writing custom code, and automating repetitive tasks saves time and resources.

The installer wraps Ansible and runs playbooks that install the different tools on the Cluster. If Ansible is not installed on your Installer Machine, the installer application can install it for you. The currently supported Ansible version is 2.14 or higher.
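If Ansible is already present, its version can be checked before running the installer. The sketch below is one possible approach, not the installer's own logic: it assumes the ansible command is on the PATH and that the first line of its --version output contains the version number (for example "ansible [core 2.14.1]"). The function names are illustrative.

```python
import re
import shutil
import subprocess

MIN_ANSIBLE = (2, 14)  # minimum Ansible version stated above


def ansible_version_from_output(output):
    """Extract (major, minor) from the first line of `ansible --version` output."""
    first_line = output.splitlines()[0] if output else ""
    match = re.search(r"(\d+)\.(\d+)", first_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def ansible_ok():
    """Return True if an ansible executable is found and reports at least MIN_ANSIBLE."""
    if shutil.which("ansible") is None:
        return False
    result = subprocess.run(
        ["ansible", "--version"], capture_output=True, text=True, check=False
    )
    version = ansible_version_from_output(result.stdout)
    return version is not None and version >= MIN_ANSIBLE


if __name__ == "__main__":
    print("Ansible OK:", ansible_ok())
```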

Cluster Machines Minimum Specs

4 nodes each with:

  • 4 vCPUs

  • 16 GB RAM

  • 60 GB root disk

  • 4 external disks of 20 GB each, partitioned as described in Storage volumes

  • Red Hat 8.4 or higher
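The CPU, RAM, and root disk minimums above can be checked on each node with a short script. The following is a sketch under stated assumptions: it reads memory from /proc/meminfo (Linux only), and it allows 10% slack because the operating system typically reports slightly less than the nominal RAM and disk size. The tolerance value and all function names are illustrative choices, not installer behavior.

```python
import os
import shutil

GIB = 1024 ** 3
MIN_VCPUS = 4
MIN_RAM_BYTES = 16 * GIB
MIN_ROOT_DISK_BYTES = 60 * GIB
# Assumption: accept 90% of the nominal size, since reported totals are usually a bit lower.
TOLERANCE = 0.9


def read_total_ram_bytes(meminfo_path="/proc/meminfo"):
    """Read MemTotal (reported in kB) from a Linux meminfo file."""
    with open(meminfo_path, encoding="ascii") as handle:
        for line in handle:
            if line.startswith("MemTotal:"):
                return int(line.split()[1]) * 1024
    raise RuntimeError("MemTotal not found in " + meminfo_path)


def spec_problems(vcpus, ram_bytes, root_disk_bytes):
    """Return a list of human-readable problems; an empty list means the node passes."""
    problems = []
    if vcpus < MIN_VCPUS:
        problems.append(f"need at least {MIN_VCPUS} vCPUs, found {vcpus}")
    if ram_bytes < MIN_RAM_BYTES * TOLERANCE:
        problems.append(f"need about 16 GB RAM, found {ram_bytes / GIB:.1f} GiB")
    if root_disk_bytes < MIN_ROOT_DISK_BYTES * TOLERANCE:
        problems.append(f"need about 60 GB root disk, found {root_disk_bytes / GIB:.1f} GiB")
    return problems


def collect_local_specs():
    """Gather (vcpus, ram_bytes, root_disk_bytes) for the machine running this script."""
    return os.cpu_count() or 0, read_total_ram_bytes(), shutil.disk_usage("/").total
```

On a node, spec_problems(*collect_local_specs()) returns the list of unmet minimums. The external disks and the operating system version are not covered by this sketch.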

Storage volumes

Once the Cluster machines are ready, special directories must be created on all machines except the main machine (see the main_hostname entry in the installer.yaml file). These directories must exist directly under the root directory: /data1, /data2, /data3, and /data4. MinIO uses these directories to replicate result files from the Spark data processing. We recommend placing these directories on dedicated volumes separate from the root volume. Data replication enhances both the resilience and the reliability of the system by storing MinIO’s data on every host.
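To confirm the directory layout on a node, a small check such as the following could be run. It is a sketch only: the function names are illustrative, and treating "separate volume" as "the directory is a mount point" is an assumption about how the disks are attached.

```python
import os
from pathlib import Path

DATA_DIR_NAMES = ("data1", "data2", "data3", "data4")


def missing_data_dirs(root="/"):
    """Return the expected data directories that do not exist under root."""
    return [
        str(Path(root) / name)
        for name in DATA_DIR_NAMES
        if not (Path(root) / name).is_dir()
    ]


def dirs_not_mounted(root="/"):
    """Return existing data directories that are not mount points (so they share another volume)."""
    return [
        str(Path(root) / name)
        for name in DATA_DIR_NAMES
        if (Path(root) / name).is_dir() and not os.path.ismount(Path(root) / name)
    ]


if __name__ == "__main__":
    print("Missing:", missing_data_dirs())
    print("Not on a separate volume:", dirs_not_mounted())
```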

LDAP Server

LDAP (Lightweight Directory Access Protocol) is a protocol used to access and maintain distributed directory service information over a network. An LDAP server is a server that implements LDAP and acts as a network directory.

LDAP servers store and help organize users’ directory information, such as names and passwords, which are used to authenticate users to the system.

The Andalusia Platform depends on an LDAP server to allow users to log in to its applications, such as:

  • Jenkins

  • Jupyterhub

  • Superset

  • etc

LDAP parameters and connection information are needed by most of Andalusia’s tools. This information must be provided in the input YAML file that the installer binary requires.

Whitelisting Andalusia Artifactory

We use an Artifactory server to manage our software artifacts for deployment purposes. The server must be whitelisted during the installation process.

URL: artifactory.andal.us

PORT: TODO
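Once the whitelisting is in place, reachability from a Cluster machine can be verified with a plain TCP connection attempt. The sketch below assumes nothing about the port, which is not yet specified above, so the caller must supply it; can_connect and check_artifactory are illustrative names.

```python
import socket

ARTIFACTORY_HOST = "artifactory.andal.us"


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_artifactory(port, timeout=5.0):
    """Check the Artifactory host on the given port; the port must be supplied by the caller."""
    return can_connect(ARTIFACTORY_HOST, port, timeout=timeout)
```

A False result only shows that the TCP connection failed; it does not distinguish a firewall block from a DNS or server problem.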
