Machine Learning for Big Data Analytics - F24

DAT 301
Closed
McMaster University Continuing Education
Hamilton, Ontario, Canada
Timeline
  • September 17, 2024
    Experience start
  • September 24, 2024
    Project Scope Meeting (TBD)
  • October 26, 2024
    Midway Check-in (TBD)
  • December 3, 2024
    Final Presentation (TBD)
  • December 7, 2024
    Experience end
General
  • Certificate
  • 20 learners; teams of 4
  • 40 hours per learner
  • Dates set by experience
  • Learners self-assign
Preferred companies
  • 3/3 project matches
  • Anywhere
  • Academic experience
  • Any company type
  • Any industries
Categories
  • Machine learning
  • Data visualization
  • Data analysis
  • Data modelling
  • Data science
Skills
  • Modern machine learning techniques
  • Neural network
  • Deep learning
  • Reinforcement learning
  • NLP
  • Text analysis
  • Big data analytics
Overview
Learner goals and capabilities

This course is part of the Big Data Programming and Analytics certificate programs.

Students in the program are adult learners with a post-secondary degree/diploma in computer science, engineering, business, etc.


Building on the fundamental principles of data analytics, this course advances to modern machine learning techniques such as neural networks, deep learning, and reinforcement learning, as well as NLP and text analysis. Application activities are structured to provide an introductory-level view of how machine learning techniques are applied to big data analytics.

Expected outcomes and deliverables

The final project deliverables will include:


  • A report on the students’ findings and details of the problem presented
  • Future collaboration ideas identified from the current project outcomes
Project Examples

The projects will provide an opportunity for businesses and learners to collaborate to identify and address real business challenges.

The projects, which can be short, will allow students to apply the data management concepts and techniques presented in the classes to address the sponsor's business challenges. Some examples are:


  • Identify and use various “big data analytics” tools, algorithms, and terminologies
  • Apply text analytics, sentiment analysis and NLP
  • Identify and apply machine learning algorithms and learn how to “scale” them to big data: trees with ensemble methods, neural networks
  • Assist the sponsor in determining whether the organization should invest in “big data analytics” technologies


You should submit a high-level proposal/business problem statement including relevant datasets and definitions, a list of acceptable tools (if applicable), and expected deliverables. Business datasets can be provided under a non-disclosure agreement or in an anonymized/synthetic data format that is relevant to your organization and business problem. The course instructors will review the documents to confirm the scope and timing of the proposed problem and its alignment with the capstone course requirements.


Analytics solutions may be applicable to (but are not limited to) the following topics:


1. Demand for social services (healthcare, emergency services, infrastructure, etc.)

2. Customer acquisition and retention

3. Merchandising for trade areas (categories)

4. Quantifying Customer Lifetime Value

5. Determining media consumption (mass vs digital)

6. Cross-sell and upsell opportunities

7. Develop high propensity target markets

8. Customer segmentation (behavioral or transactional)

9. New Product/Product line development

10. Market Basket Analysis to understand which items are often purchased together

11. Ranking markets by potential revenue

12. Consumer personification


To ensure students’ learning objectives are achieved, we recommend that datasets contain at least 20,000 rows. Data need to be ‘clean’. If more than one database is provided and they must be joined, students will be required to integrate them. This supports the learning experience and minimizes partner data preparation.
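
Partners who want to check a dataset against these recommendations before submitting could use something like the following. It is only a minimal sketch in plain Python, assuming the records are already loaded as lists of dictionaries. The function names, field names, and sample data are hypothetical and are not part of the course materials, and "clean" is interpreted here simply as "no missing values in the required fields", which is an assumption.

```python
# Hypothetical readiness check; all names here are illustrative only.
MIN_ROWS = 20_000  # recommended minimum row count from this listing


def readiness_report(rows, required_fields, min_rows=MIN_ROWS):
    """Summarize row count and missing values for a list of dict records."""
    missing = {field: 0 for field in required_fields}
    for row in rows:
        for field in required_fields:
            value = row.get(field)
            # Treat None and blank strings as missing (an assumption).
            if value is None or str(value).strip() == "":
                missing[field] += 1
    return {
        "row_count": len(rows),
        "meets_min_rows": len(rows) >= min_rows,
        "missing_by_field": missing,
    }


def join_on_key(left, right, key):
    """Inner-join two lists of dict records on a shared key column."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            # On a field-name clash, the value from `left` wins.
            joined.append({**match, **row})
    return joined


if __name__ == "__main__":
    # Synthetic sample data standing in for two partner tables.
    customers = [{"customer_id": i, "region": "ON"} for i in range(20_000)]
    orders = [{"customer_id": i, "total": 10.0 + i} for i in range(0, 20_000, 2)]
    print(readiness_report(customers, ["customer_id", "region"]))
    print(len(join_on_key(orders, customers, "customer_id")))
```

Student teams would normally perform the join themselves as part of the project; the join helper is included only to show what "integrating" two tables on a shared key means.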

Additional company criteria

Companies must answer the following questions to submit a match request to this experience:

We recommend that your datasets contain at least 20,000 rows. Do you confirm?

Is the data "clean"?

If more than one database is provided and they must be joined, students will be required to integrate them. Do you agree?