## MST0052 -- Lecture 8

### Project Showcase

Fall 2026

---

## What's happening today

- **Volunteer presenters** show their semester project to the class
- **5 minutes** per slot, followed by a short Q&A
- I give feedback live; you can chime in
- If you're not presenting -- **you are not off the hook.** Listen, take notes on what's working in other projects, ask questions.

---

## Why we do this

**Two reasons.**

1. **Feedback in front of an audience** -- you hear what I'd say, and your classmates often catch things I won't
2. **Cross-pollination** -- you see what other students are trying, get ideas, spot common pitfalls in other people's work that you can then check in your own

You also get a (low-stakes) rehearsal for the oral exam: explain your choices out loud, take questions.

---

## Format for presenters

In your 5 minutes, cover:

- **Problem.** What are you predicting? Regression or classification?
- **Data.** Source, size, target, key features
- **Pipeline so far.** Preprocessing, splits, baseline model
- **First results.** One number you trust, on held-out data
- **One question.** What you're stuck on, or want a second opinion on

No slides required. Show a notebook, a script, or a one-page sketch.

---

## Format for the audience

When you listen, listen actively:

- Is the **target** clearly defined?
- Does the **preprocessing** fit the chosen models?
- Is the **validation strategy** honest? (no leakage, no test-set tuning)
- Is the **baseline** strong enough that beating it means something?
- What is the **one thing** you would change?

You can ask one short question per presentation. Be specific.
**"Looks good" helps nobody.**

---

## Common pitfalls to listen for

- **Leakage** -- preprocessing fitted on the full dataset before splitting
- **Wrong metric** -- accuracy on imbalanced classes; R² with no error scale
- **No real baseline** -- jumping to a complex model without a linear or majority comparison
- **Test set used during development** -- if you've looked at test numbers more than once, it's compromised
- **No reproducibility** -- seeds unset, paths hard-coded, environment undocumented

If you hear one of these in someone's project, raise it. If it applies to *your own* project, write it down for later.

---

## What if you wanted feedback but didn't sign up?

You have three options:

- **Talk to me after class** today
- **Office hours** -- drop in, no appointment
- **Ask for a meeting** if your question is bigger than 10 minutes

Same applies any week of the semester -- not just today.

---

## Wrap-up

By the end of today, you should leave with:

- **Three ideas** from other students' projects worth borrowing
- **One pitfall** to audit in your own work this week
- **One next-step** for your project (a model to try, a metric to compute, a feature to engineer)

---

## What's next

**Lecture 9:** Unsupervised learning and PCA

- Dimensionality reduction
- When PCA helps and when it doesn't
- Scaling matters (again)
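
---

## Reference: spotting leakage in scaling

The leakage pitfall from the list of common pitfalls (preprocessing fitted on the full dataset before splitting) is easy to miss in a notebook. Below is a minimal sketch of the difference, using only the Python standard library. The data is synthetic, and the helper names `fit_scaler` and `transform` are hypothetical, chosen for illustration; they are not from any course code.

```python
import random
import statistics


def fit_scaler(values):
    """Return (mean, std) estimated from the given values only."""
    return statistics.fmean(values), statistics.pstdev(values)


def transform(values, mean, std):
    """Standardize values with previously fitted statistics."""
    return [(v - mean) / std for v in values]


# Synthetic one-feature dataset (illustrative only), with a fixed seed
# so the run is reproducible.
rng = random.Random(0)
data = [rng.gauss(50.0, 10.0) for _ in range(200)]

# Split first: the last 25% of rows is held out.
split = int(0.75 * len(data))
train, test = data[:split], data[split:]

# Leaky: statistics computed on all rows, held-out rows included.
leaky_mean, leaky_std = fit_scaler(data)

# Honest: statistics computed on the training rows only,
# then reused unchanged on the held-out rows.
mean, std = fit_scaler(train)
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)

print(f"train-only mean/std: {mean:.3f} / {std:.3f}")
print(f"full-data  mean/std: {leaky_mean:.3f} / {leaky_std:.3f}")
```

The two sets of statistics differ because the full-data version has already seen the held-out rows. In a real project you would typically not hand-roll this; with scikit-learn, for example, placing a `StandardScaler` inside a `Pipeline` and fitting the pipeline on the training split only gives the same guarantee.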