Welcome to Andalusia
Andalusia offers a robust on-premises data science platform designed to operate effectively in restrictive network environments, with a wide array of tools for data science and business intelligence use cases. Here are some of the features we think data scientists will value most:
JupyterHub: A multi-user version of the Jupyter Notebook, an interactive computing environment in which users create and share documents that contain live code, equations, visualizations, and narrative text. Andalusia uses JupyterHub to give data scientists a collaborative environment.
Superset: Apache Superset is a modern, enterprise-ready business intelligence (BI) web application for exploring, visualizing, and sharing data. Andalusia uses it for BI tasks such as dashboard creation, data exploration, and visualization.
Data Integrations: Andalusia can connect to and fetch data from various sources, such as databases, APIs, or flat files, which is essential for any data science platform.
Distributed Spark Cluster: Apache Spark is an open-source, general-purpose distributed cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance. Andalusia uses Spark to handle large-scale data processing tasks.
S3 Storage System: Andalusia uses an S3 storage system, similar to a cloud-based object storage system. This allows it to store large amounts of data efficiently and with high availability.
Scalability: The ability to scale up or down is crucial for any modern data science platform. In practice, this could mean adding more nodes to the Spark cluster or increasing storage capacity as demand grows.
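To illustrate how the Spark cluster and the S3 storage system might fit together, here is a minimal sketch that builds the Spark properties commonly used to reach an S3-style object store through Hadoop's `s3a` connector. It is an assumption that Andalusia exposes its storage this way; the helper name `s3a_spark_conf`, the endpoint URL, and the credentials are all hypothetical placeholders, so check your deployment's settings before relying on any of them.

```python
def s3a_spark_conf(endpoint, access_key, secret_key, use_ssl=False):
    """Return Spark properties for an S3-style object store via Hadoop s3a.

    Hypothetical helper for illustration only; it is not part of Andalusia.
    """
    return {
        # Address of the object store (placeholder value supplied by the caller).
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Self-hosted object stores usually need path-style bucket addressing.
        "spark.hadoop.fs.s3a.path.style.access": "true",
        "spark.hadoop.fs.s3a.connection.ssl.enabled": str(use_ssl).lower(),
    }


# Assumed endpoint and credentials; replace with your own.
conf = s3a_spark_conf("http://s3.example.internal:9000", "EXAMPLE_KEY", "EXAMPLE_SECRET")
for key, value in sorted(conf.items()):
    print(f"{key}={value}")
```

In a notebook, each pair could be passed to `SparkSession.builder.config(key, value)` before calling `getOrCreate()`, assuming the `hadoop-aws` connector is available on the cluster.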
We've put together some helpful guides to get you set up with our product quickly and easily.