Machine Learning (Software Automation)

Opinion

This unit is called Software Automation in the syllabus, but that is not accurate. The syllabus is heavily focused on Machine Learning (ML), and to be called Software automation, it should also cover other automation activities. This would include DevOps and DevSecOps, the practice of creating software applications to reduce human intervention in time-consuming IT tasks such as cloud operations, deployments, and software and system orchestration.

Info

The NSW syllabus document has a gap in that it specifies what to teach (algorithms, concepts, impacts) but not the foundational skills needed to actually do ML work. Sections have been added to address this gap.

Missing from Syllabus but Essential:

Data handling - Can't do ML without loading and exploring data
Model evaluation - Can't know if models work without metrics
Train/test methodology - Can't properly assess models without this
Preprocessing - Real data needs cleaning and preparation
Implementation environment - What tools/libraries to use

Note

This is work in progress. The contents on this page may change as the course details are created.

Introduction to Machine Learning: Concepts, Implementation and Impact

This unit has been design to be completed in approximately 8 weeks with 6-8 hours of study per week.

Week 1: Foundations of AI, ML & Automation

Syllabus Coverage

✓ Distinguish between artificial intelligence (AI) and ML
✓ Investigate how machine learning (ML) supports automation through the use of DevOps, robotic process automation (RPA) and business process automation (BPA)

Learning Objectives

Distinguish between artificial intelligence (AI) and machine learning (ML)
Investigate how ML supports automation through DevOps, RPA, and BPA
Understand the context and applications of ML in modern software systems

Content Overview

Conceptual Foundations:

Definitions and clear distinctions between AI and ML with real-world examples
Exploration of automation technologies:
- DevOps (Development and Operations integration)
- Robotic Process Automation (RPA)
- Business Process Automation (BPA)
How ML enables and enhances automation processes
Industry case studies demonstrating ML-powered automation

Week 2: ML Training Models & Data Fundamentals

Syllabus Coverage

✓ Explore models of training ML including: supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning
✓ Investigate common applications of key ML algorithms including: data analysis and forecasting, virtual personal assistants, image recognition

Learning Objectives

Explore four models of ML training: supervised, unsupervised, semi-supervised, and reinforcement learning
Understand common ML applications: data analysis/forecasting, virtual assistants, image recognition
Develop foundational data handling skills essential for ML
Learn to load, explore, and visualise datasets

Content Overview

ML Training Models:

Supervised Learning: Learning from labelled data (spam filters, house price prediction)
Unsupervised Learning: Finding patterns in unlabelled data (customer segmentation, clustering)
Semi-Supervised Learning: Combining labelled and unlabelled data
Reinforcement Learning: Learning through reward/punishment (overview level - game AI, robotics context)

Common ML Applications:

Data analysis and forecasting (trend prediction, time series)
Virtual personal assistants (Siri, Alexa, chatbots)
Image recognition (facial recognition, object detection, medical imaging)

Essential: Data Fundamentals

Understanding datasets: rows (samples), columns (features), target variables
Loading data from CSV files using pandas
Basic data exploration: shape, info, describe, head/tail
Data visualisation basics: scatter plots, histograms, bar charts using plotly
Understanding what makes "good" data for ML

Week 3: Classical ML Algorithms - Concepts & Simple Implementation

Syllabus Coverage

✓ Research models used by software engineers to design and analyse ML including: decision trees, neural networks
✓ Describe types of algorithms associated with ML including: linear regression, logistic regression, K-nearest neighbour

Learning Objectives

Understand decision trees and how they make decisions
Understand K-nearest neighbour (K-NN) algorithm
Introduce linear and logistic regression concepts
Introduce neural networks conceptually (implementation in Week 6)
Implement simple decision tree and K-NN models
Learn train/test split and basic model evaluation

Content Overview

Design & Analysis Models:

Decision Trees:

Visual tree structure and decision-making process
How trees split data based on features
Advantages: interpretable, handles non-linear relationships
Simple implementation using scikit-learn

Neural Networks (Conceptual Introduction):

Simplified explanation: input layer, hidden layers, output layer
How information flows through networks
When neural networks are useful (complex patterns, large data)
Note: Implementation deferred to Week 6

Algorithm Types:

K-Nearest Neighbour (K-NN):

How K-NN classifies based on "nearest" data points
Distance calculations (Euclidean distance simplified)
Choosing K value
Simple implementation using scikit-learn

Linear Regression (Introduction):

Fitting a line through data points
Understanding slope and intercept
Predicting numeric values
Full implementation in Week 4

Logistic Regression (Introduction):

Binary classification (yes/no, true/false)
Predicting probabilities
When to use vs. linear regression
Full implementation in Week 4

Essential: Model Evaluation Fundamentals

Train/test split concept: why we need separate data for testing
Accuracy as a basic metric
Understanding model performance
Overfitting vs underfitting (simplified introduction)

Week 4: Regression Models - Linear & Polynomial Implementation

Syllabus Coverage

✓ Design, develop and apply ML regression models using an OOP to predict numeric values including: linear regression, polynomial regression
✓ Describe types of algorithms associated with ML including: linear regression (implementation)

Learning Objectives

Understand OOP principles in the context of ML implementation
Design and develop linear regression models using OOP
Design and develop polynomial regression models using OOP
Apply regression models to predict numeric values
Evaluate regression model performance using appropriate metrics
Understand overfitting and underfitting in regression context

Content Overview

OOP for Machine Learning:

Review of OOP fundamentals: classes, objects, methods, attributes
Why OOP matters for ML: modularity, reusability, organisation
Structure of scikit-learn's OOP approach (fit, predict, score methods)
Reading and understanding ML code structure

Linear Regression:

Mathematical concept (simplified): finding the best-fit line
Relationship between features (X) and target (y)
Implementation using scikit-learn's LinearRegression class
Making predictions with trained models
Visualising regression lines with scatter plots

Polynomial Regression:

When linear relationships aren't enough (curved patterns)
Understanding polynomial features (x, x², x³)
Implementation using PolynomialFeatures + LinearRegression
Comparing linear vs. polynomial fit
Danger of overfitting with high-degree polynomials

Essential: Model Evaluation for Regression

Mean Squared Error (MSE): Understanding prediction errors
R² Score: How well does the model explain the data?
Visualising residuals (prediction errors)
Overfitting: Model too complex, fits training data perfectly but fails on new data
Underfitting: Model too simple, doesn't capture patterns
Using train/test split to detect overfitting

Week 5: Logistic Regression & Model Comparison

Syllabus Coverage

✓ Design, develop and apply ML regression models using an OOP to predict numeric values including: logistic regression
✓ Describe types of algorithms associated with ML including: logistic regression (implementation)

Learning Objectives

Design and develop logistic regression models using OOP for classification
Understand binary classification and probability predictions
Evaluate classification models using appropriate metrics
Compare multiple algorithms on the same problem
Select appropriate algorithms based on problem characteristics
Strengthen feature engineering and data preparation skills

Content Overview

Logistic Regression:

Binary classification explained (yes/no, spam/not spam, pass/fail)
How logistic regression predicts probabilities (0 to 1)
Decision threshold (typically 0.5)
Implementation using scikit-learn's LogisticRegression class
Key difference from linear regression: predicting categories, not numbers

Classification Evaluation Metrics (Critical Addition)

Accuracy: Percentage of correct predictions
Confusion Matrix: True positives, false positives, true negatives, false negatives
Precision and Recall: Understanding trade-offs
When accuracy can be misleading (imbalanced datasets)
Visualising classification results

Model Comparison Framework:

Comparing decision trees, K-NN, and logistic regression on same problem
Performance metrics comparison table
Understanding algorithm strengths and weaknesses
When to choose which algorithm
Computational cost considerations (speed vs. accuracy)

Essential: Feature Engineering Basics

What are features and why they matter
Handling categorical variables (one-hot encoding simplified)
Feature scaling/normalisation (when and why)
Selecting relevant features for your model
Impact of feature choice on model performance

Week 6: Neural Networks - Simplified Introduction & Advanced Classical ML

Syllabus Coverage

✓ Apply neural network models using an OOP to make predictions
✓ Research models used by software engineers to design and analyse ML including: neural networks (implementation)

Learning Objectives

Understand neural network architecture at implementation level
Apply simple neural networks using high-level OOP frameworks
Make predictions using trained neural networks
Compare neural network performance with classical ML algorithms
Deepen understanding of when to use neural networks vs. classical algorithms
Alternative focus: Advanced applications of classical ML algorithms

Content Overview

Pedagogical Note

Given the remote learning context and lack of ML expertise, this week offers TWO PATHWAYS:

Pathway A (Recommended): Simplified neural networks using very high-level tools

Pathway B (Alternative): Advanced classical ML techniques

Teachers/course designers should choose based on student readiness and available support.

PATHWAY A: Simplified Neural Networks (Observation & High-Level Use)

Neural Network Architecture:

Review of structure: input layer, hidden layers, output layer
Neurons, weights, and activation functions (simplified explanation)
How networks learn: concept of training, loss, and optimisation (non-mathematical)
When neural networks excel: complex patterns, large datasets, image/text data

High-Level Implementation Approach:

Using Keras Sequential API (highest abstraction level)
Pre-built architectures for common problems
Transfer learning concept: using pre-trained networks
Focus on using neural networks rather than building from scratch

Simplified Hands-On Work:

Option 1: Keras Sequential for Simple Problems

Very simple fully-connected networks (2-3 layers maximum)
Binary or multi-class classification on tabular data
Heavy scaffolding: modify existing code rather than write from scratch
Emphasis on observing behaviour and comparing with classical ML

Option 2: Transfer Learning (Even Higher Level)

Use pre-trained models (e.g., image classification with MobileNet)
Focus on loading models and making predictions
No training required, just inference
Demonstrates power of neural networks without complexity

Option 3: Google Teachable Machine

Web-based, no-code neural network training
Image, sound, or pose classification
Export model and use in Python
Most accessible option for remote learners

Comparison Framework:

Neural network vs. logistic regression on same tabular data
When is the extra complexity of neural networks worth it?
Computational cost, interpretability trade-offs
Performance gains vs. ease of debugging

PATHWAY B: Advanced Classical ML (Alternative If Neural Networks Too Complex)

Focus: Deepen mastery of accessible, interpretable algorithms

Advanced Decision Trees:

Ensemble methods: Random Forests (conceptual + implementation)
How multiple trees make better predictions
Feature importance from tree ensembles
Implementation using RandomForestClassifier

Advanced K-NN and Other Algorithms:

K-NN for regression (not just classification)
Distance metric selection
Introduction to Support Vector Machines (SVM) - basic concept
Naive Bayes for text classification (simple example)

Model Selection and Validation:

Cross-validation: better than single train/test split
Grid search for hyperparameter tuning (simplified)
Selecting best model systematically
Validation curves to understand model behaviour

End-to-End ML Pipeline:

Complete workflow: data loading → preprocessing → training → evaluation → prediction
Putting it all together in organised code
Creating reusable functions for ML tasks
Documentation and code organisation

Week 7: Human Factors, Behaviour Patterns & Bias in ML/AI

Syllabus Coverage

✓ Explore by implementation how patterns in human behaviour influence ML and AI software development including: psychological responses, patterns related to acute stress response, cultural protocols, belief systems
✓ Investigate the effect of human and dataset source bias in the development of ML and AI solutions

Learning Objectives

Explore how patterns in human behaviour influence ML and AI development
Understand psychological responses, stress patterns, cultural protocols, and belief systems in ML context
Investigate human bias and dataset source bias in ML systems
Identify bias in datasets and model outputs through practical investigation
Analyse real-world case studies of bias in ML/AI
Develop strategies for bias detection and mitigation

Content Overview

Patterns in Human Behaviour Influencing ML/AI:

Psychological Responses:

How human perception and cognition affect ML design choices
Confirmation bias in data collection and interpretation
Anchoring effects in model evaluation
User trust and acceptance of AI recommendations
Example: Health apps and stress detection - how psychological factors affect data

Patterns Related to Acute Stress Response:

Fight-or-flight responses and their digital traces
How ML systems detect and respond to stress patterns
Ethical considerations in stress monitoring
Example: Social media algorithms detecting mental health signals
Privacy and consent issues

Cultural Protocols:

How cultural context shapes data and model design
Language and communication patterns across cultures
Different cultural norms in privacy, consent, and data sharing
Example: Facial recognition trained primarily on Western faces
Importance of diverse training data

Belief Systems:

How developer beliefs and values embed in ML systems
Societal assumptions reflected in algorithms
Religious and ethical considerations in AI design
Example: Recommendation algorithms reinforcing existing beliefs (filter bubbles)
Responsibility of developers to recognise their biases

Human and Dataset Source Bias:

Types of Bias:

Human Bias: Developer assumptions, sampling bias, labeling bias
Dataset Bias: Historical bias, representation bias, measurement bias
Algorithmic Bias: How ML amplifies existing biases in data
Interaction Between Biases: How they compound

Real-World Case Studies:

Facial Recognition: Lower accuracy for people of colour (Gender Shades study)
Hiring Algorithms: Amazon's resume screening bias against women
Criminal Justice: COMPAS recidivism prediction bias
Language Models: Gender and racial stereotypes in text generation
Medical AI: Diagnostic systems trained primarily on one demographic

Practical Investigation of Bias:

Hands-on experiment: Train model on biased dataset
Compare with model trained on balanced dataset
Observe differences in predictions
Quantify bias in model outputs
Document findings and implications

Week 8: Impact Assessment & Synthesis

Syllabus Coverage

✓ Assess the impact of automation on the individual, society and the environment including: safety of workers, people with disability, the nature and skills required for employment, production efficiency, waste and the environment, the economy and distribution of wealth

Learning Objectives

Assess the impact of automation on individuals, society, and environment across multiple dimensions
Analyse safety implications for workers
Evaluate accessibility and inclusion for people with disabilities
Examine changing nature and skills required for employment
Assess production efficiency, waste, and environmental impact
Evaluate economic effects and wealth distribution
Synthesise all learning from Weeks 1-7 into comprehensive understanding
Reflect on ethical responsibilities of ML/AI developers

Content Overview

Impact of Automation on Multiple Dimensions:

Safety of Workers:

How automation changes workplace hazards
Robots and ML in dangerous environments (manufacturing, mining, inspection)
New safety considerations (human-robot interaction)
Deskilling and loss of safety awareness
Monitoring and surveillance concerns
Case studies: warehouse automation, autonomous vehicles in industry

People with Disability:

Assistive technologies powered by ML (speech recognition, image description, predictive text)
Accessibility improvements through automation
Potential barriers: cost, complexity, design assumptions
Risk of exclusion if ML not trained on diverse populations
Case studies: accessible navigation, communication aids, adaptive interfaces

Nature and Skills Required for Employment:

Job displacement vs. job transformation
New skills needed in automated workplaces
Shift from manual to cognitive and creative skills
Lifelong learning and reskilling needs
Digital divide and access to retraining
Case studies: manufacturing evolution, customer service automation, creative industries

Production Efficiency, Waste, and Environment:

Efficiency gains: faster production, reduced errors, optimised resource use
Environmental costs: energy consumption of ML models (especially large neural networks), e-waste from automation hardware
Environmental benefits: optimised logistics, precision agriculture, smart energy grids
Sustainability considerations in ML development
Case studies: supply chain optimisation, environmental monitoring

Economy and Distribution of Wealth:

Economic productivity gains from automation
Wage effects: which jobs grow, which decline
Wealth concentration: who owns the automation technology
Geographic disparities: impacts on different communities
Policy considerations: taxation, universal basic income debates
Case studies: economic impact studies, regional disparities

Synthesis of All Learning:

Connecting technical skills (Weeks 1-6) with ethical awareness (Week 7) and societal impact (Week 8)
ML/AI developer's responsibility
Informed decision-making about ML applications
Critical evaluation of ML systems