Fundamentals of Data Analytics

ENFE600003

Prerequisites

Course Type

Compulsory

Credit Hours

2

Course Description

This course opens with an introduction to the phenomenon of big data and then follows the framework of the Data Science process. It begins with data collection methods, continues with data preprocessing techniques, including data cleaning and other preparatory steps, and proceeds to Exploratory Data Analysis (EDA) using both graphical and non-graphical approaches. A review of fundamental statistical concepts lays the groundwork for effective data processing. The course then turns to machine learning methods, with particular emphasis on supervised learning techniques. Machine learning algorithms are applied in practice through predictive modeling with regression, specifically Multiple Linear Regression (MLR), and the performance of the resulting models is evaluated. Students are required to use appropriate tools and programming languages for data preprocessing, visualization, processing, and analysis, namely Excel, Tableau, and Python.

Course Learning Outcomes

  • Perform data summarization.
  • Create data visualizations.
  • Analyze data to extract valuable information.

Course Content / Syllabus

  • Introduction to Data Analytics and Visualization

    • Overview of data analytics concepts
    • Fundamentals of data visualization techniques
  • Review of Statistics

    • Key statistical concepts and methods for data analysis
  • Data Acquisition and Collection

    • Methods for gathering and sourcing data
    • Tools and techniques for data collection
  • Data Preprocessing

    • Cleaning and transforming raw data
    • Handling missing values and outliers
  • Exploratory Data Analysis (EDA)

    • Identifying patterns and trends in data
    • Visualization for better data understanding
  • Feature Engineering

    • Creating and selecting features for model building
    • Techniques for enhancing data quality
  • Introduction to Machine Learning

    • Overview of machine learning concepts and workflows
  • Supervised Method: Regression

    • Basics of regression models in supervised learning
  • Model Performance Evaluation

    • Techniques for evaluating and validating model performance
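As an illustration of how the later topics in this syllabus connect, the following is a minimal Python sketch of the path from preprocessing through regression to model evaluation. It is not course material: the toy records, the variable names, and the helper functions (`solve_linear_system`, `fit_mlr`, `predict`, `r_squared`, `rmse`) are all hypothetical, and the data is noise-free by construction so that the fitted coefficients are easy to check. Only the standard library is used to keep the sketch self-contained; in practice a library such as NumPy or pandas would normally handle these steps.

```python
import math
import statistics

# Hypothetical toy records (x1, x2, y); None marks a missing value.
# The complete rows satisfy y = 1 + 2*x1 + 3*x2 exactly (an assumption for illustration).
raw = [
    (1.0, 2.0, 9.0),
    (2.0, 1.0, 8.0),
    (3.0, None, 12.0),
    (3.0, 4.0, 19.0),
    (4.0, 3.0, 18.0),
    (5.0, 5.0, 26.0),
]


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(features, target):
    """Least-squares fit via the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + list(row) for row in features]  # prepend intercept column
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, features):
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], row)) for row in features]


def r_squared(actual, predicted):
    mean_y = statistics.mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_y) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    return math.sqrt(statistics.mean([(a - p) ** 2 for a, p in zip(actual, predicted)]))


# Preprocessing: drop rows that contain a missing value.
clean = [row for row in raw if None not in row]
features = [row[:2] for row in clean]
target = [row[2] for row in clean]

# Non-graphical summary of the target variable.
summary = {"n": len(target), "mean": statistics.mean(target), "stdev": statistics.stdev(target)}

# Multiple Linear Regression and its evaluation (on the training rows only).
coefs = fit_mlr(features, target)
fitted = predict(coefs, features)
print(summary)
print("coefficients:", [round(c, 6) for c in coefs])
print("R^2:", round(r_squared(target, fitted), 6), "RMSE:", round(rmse(target, fitted), 6))
```

The sketch scores the model on the same rows it was fitted to, which is enough to show the mechanics but overstates performance on real data; a held-out test set or cross-validation is the usual choice for validation.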

Recommended References

  1. Data Science from Scratch, by Joel Grus
  2. Data Science: A First Introduction, https://python.datasciencebook.ca/
  3. Python for Data Analysis, by Wes McKinney, https://wesmckinney.com/book/
  4. Tableau Elearning, https://elearning.tableau.com/