This repository contains a collection of installation guides for essential tools used throughout the Data Engineering course. Each tool has its own dedicated Markdown file with clear setup steps, configuration notes, and troubleshooting tips.
You will find step-by-step setup instructions for tools such as:
- Hadoop (HDFS & YARN)
- Apache Hive
- Apache Spark
- Airflow
- Kafka
Each guide is located in its own .md file for easy navigation and reference.
This repository is designed to:
- Provide a standardized installation reference for students
- Reduce setup issues during hands-on labs
- Ensure consistency across different environments
- Serve as a quick troubleshooting resource
To use the guides:
- Browse the repository to find the relevant installation file
- Follow the steps in your selected tool’s `.md` guide
- Use the troubleshooting notes at the bottom of each guide if things break
- Continue to the next guide as required for your module
If you find a problem or want to improve a guide, feel free to open an issue or submit a pull request.