Biz Analytics Review

What is BA?

Skills, Tech & Practices for continuous, iterative exploration & investigation of data to gain insights and drive biz planning

Data → Insights → Biz Planning
Types
- Descriptive
  - Insights about past
  - Summarise raw data to make interpretable
- Predictive
  - Understand the future
  - Provide actionable insights based on past data
- Prescriptive
  - Advise on possible outcomes
  - Attempt to quantify effect of future decisions on outcomes

Descriptive Biz Analytics

Randomness, population and samples
Data - Categorical or Numerical
Sample - Random, Systematic, Stratified or Clustered
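
A minimal sketch of the four sampling schemes listed above, using only the standard library. The `customers` list, its `region` field and all function names are hypothetical, chosen for illustration only.

```python
import random

def simple_random(population, n, rng):
    # Every item has the same chance of being picked.
    return rng.sample(population, n)

def systematic(population, n, rng):
    # Pick a random start, then take every `step`-th item (assumes n <= len(population)).
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]

def stratified(population, key, fraction, rng):
    # Group items into strata, then sample the same fraction from each stratum.
    strata = {}
    for item in population:
        strata.setdefault(key(item), []).append(item)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

def cluster(population, key, n_clusters, rng):
    # Pick whole clusters at random and keep every item inside them.
    clusters = {}
    for item in population:
        clusters.setdefault(key(item), []).append(item)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [item for c in chosen for item in clusters[c]]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
customers = [{"id": i, "region": "NSEW"[i % 4]} for i in range(100)]

srs = simple_random(customers, 10, rng)
sys_sample = systematic(customers, 10, rng)
strat = stratified(customers, lambda c: c["region"], 0.2, rng)
clus = cluster(customers, lambda c: c["region"], 2, rng)
```

Stratified keeps every region represented in proportion; clustered keeps only some regions but takes them whole.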
Stats
- Central Tendency: Mean, Mode, Median, Midrange, Quartiles
- Dispersion: Range, Var, Std Dev, Interquart Range, Coeff of Var
- Shape: Skewness or Kurtosis
- Exploratory: Five number summary, box-and-whiskers
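
The measures above can be computed with Python's `statistics` module. This is a sketch on a made-up `data` list, and the quartiles use the module's default "exclusive" method, so other tools may give slightly different values.

```python
import statistics as st

data = [12, 15, 15, 18, 21, 24, 30, 45]  # hypothetical sample

# Central tendency
mean = st.mean(data)
median = st.median(data)
mode = st.mode(data)
midrange = (min(data) + max(data)) / 2
q1, q2, q3 = st.quantiles(data, n=4)  # quartiles; q2 equals the median

# Dispersion
data_range = max(data) - min(data)
variance = st.variance(data)   # sample variance (n - 1 denominator)
std_dev = st.stdev(data)       # sample standard deviation
iqr = q3 - q1                  # interquartile range
coeff_var = std_dev / mean     # coefficient of variation

# Exploratory: five number summary (the numbers a box-and-whiskers plot draws)
five_number = (min(data), q1, median, q3, max(data))

# Rough shape check: mean above median hints at right (positive) skew
right_skew_hint = mean > median
```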
Data Visualisation
- Presentation of data visually to amplify cognition
- Tool: Tableau
- Designing Data Viz Products: Design → Paper&Pencil → Execution
- Extra
  - Trend Lines, Forecasting, Clustering, What-If Analysis

Predictive Biz Analytics / Machine Learning

ML is subset of AI
What Is:
- Transforming data into knowledge to produce actionable insights
- Gives computers ability to learn without being explicitly programmed
ML Lifecycle
1. Define Goals
  1. Specify Biz Prob
  2. Define unit of analysis, prediction target
  3. Prioritise model criteria
2. Data Prep
  1. Find appropriate data
  2. Merge data in a single table
  3. Explore data
  4. Clean data
  5. Feature engineering
3. Create Model
4. Interpret Model
5. Implement Model
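
A rough sketch of step 2 (Data Prep) in plain Python, assuming the unit of analysis is one customer. The `customers` and `orders` tables, the column names and the helper functions are all hypothetical; a real project would more likely use a dataframe library.

```python
from statistics import median

# Hypothetical source tables
customers = {
    1: {"age": 34, "region": "north"},
    2: {"age": None, "region": "south"},  # missing age
    3: {"age": 51, "region": "north"},
}
orders = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 1, "amount": 80.0},
    {"customer_id": 2, "amount": 40.0},
    {"customer_id": 3, "amount": 300.0},
]

def merge(customers, orders):
    # Merge data in a single table: one row per customer with order totals.
    rows = {cid: {"customer_id": cid, **attrs, "n_orders": 0, "total_spend": 0.0}
            for cid, attrs in customers.items()}
    for o in orders:
        row = rows.get(o["customer_id"])
        if row is None:
            continue  # skip orders with no matching customer
        row["n_orders"] += 1
        row["total_spend"] += o["amount"]
    return list(rows.values())

def missing_counts(rows):
    # Explore data: count missing values per column.
    return {c: sum(r[c] is None for r in rows) for c in rows[0]}

def clean(rows):
    # Clean data: fill missing ages with the median of the known ages.
    fill = median(r["age"] for r in rows if r["age"] is not None)
    return [{**r, "age": fill if r["age"] is None else r["age"]} for r in rows]

def add_features(rows):
    # Feature engineering: derive new columns from existing ones.
    out = []
    for r in rows:
        avg = r["total_spend"] / r["n_orders"] if r["n_orders"] else 0.0
        out.append({**r, "avg_order_value": avg,
                    "is_north": int(r["region"] == "north")})
    return out

merged = merge(customers, orders)
table = add_features(clean(merged))
```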
Model Types (steps 3-5)
- Supervised
  - Numeric Prediction → Regression
    - Regression Types
      1. Linear Regression
        
        Univariate Linear Reg
        
        Multivariate Linear Reg
        
        Non-Linear Reg
      2. Decision Tree
        
        Built top down from root (contains all instances) to leaf (prediction with smaller SD)
        
        Homogeneity of a node is measured by entropy (classification) or Std Dev (regression); 0 = fully homogeneous
        
        Pros
        
        - Works for both Reg and Classification
        - Easy to interpret
        - Generates biz knowledge
        
        Cons
        
        - Prone to overfitting
        - Too sensitive to individual instances, does not generalise well
      3. Random Forest
        
        Set of Decision trees
        
        Train each tree on a different random sample of the training dataset, combine predictions at end (average for Reg, majority vote for Classification)
      4. Deep Learning
        
        Based on large neural networks
        
        Learn by example (training set shows examples, connections are made automatically)
      5. Ensembles
        
        Superset of random forest
        
        Set of any number and type of models
    - Evaluation
      - Mean Absolute Error
      - Mean Squared Error
      - R-Squared
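
To make the three regression metrics concrete, here is a small sketch that fits a univariate linear regression by ordinary least squares and scores it. The function names and the `spend`/`sales` numbers are made up for illustration.

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = slope * x + intercept.
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def mae(y_true, y_pred):
    # Mean Absolute Error: average size of the errors.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    # Mean Squared Error: penalises large errors more heavily.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    # Share of the variance in y_true explained by the predictions.
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical data: ad spend vs. sales
spend = [1, 2, 3, 4, 5]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(spend, sales)
predicted = [slope * x + intercept for x in spend]
scores = {"mae": mae(sales, predicted),
          "mse": mse(sales, predicted),
          "r2": r_squared(sales, predicted)}
```

Lower MAE and MSE are better; R-Squared closer to 1 is better. In practice the metrics would be computed on held-out data rather than the training set used here.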
  - Categorical Prediction → Classification
    - Models
      1. Logistic Regression
        
        Instead of fitting the categorical var directly, we predict the probability of each category
      2. Decision Tree
        
        See above (Regression>DecisionTree)
      3. Random Forest
        
        See above (Regression>RandomForest)
      4. Deep Learning
      5. Ensembles
    - Evaluation
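
Going back to the Logistic Regression entry under Classification: a minimal sketch of how the model outputs a probability per category rather than the category itself. It handles two classes and one feature, and is trained with plain gradient descent. The data, the learning rate and the function names are assumptions for illustration.

```python
import math

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    # Gradient descent on the average log-loss for p(y=1) = sigmoid(w * x + b).
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict_proba(x, w, b):
    # Probability of each category, which always sums to 1.
    p1 = sigmoid(w * x + b)
    return {0: 1.0 - p1, 1: p1}

# Hypothetical data: feature value vs. binary label
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(xs, ys)
low = predict_proba(1, w, b)
high = predict_proba(6, w, b)
```

A class label is then obtained by thresholding the probability (commonly at 0.5), which is a separate decision from fitting the model.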
- Unsupervised