Class is a combination of lecture time and group work/coding time.
Bring computer every day
Jupyter notebooks
Python
Once a week, there will be a short check-in quiz covering basic content related to lectures
since the last class. Possible questions include checking on definitions or basic understanding of
major ideas.
10 points per quiz
Drop two lowest grades
Class Structure Pt 2
Homework is due once a week, at midnight on the day marked in the schedule (mostly Sundays).
20 points per homework
Drop two lowest grades
Sliding scale:
24 hours late: 5% penalty.
48 hours late: 15% penalty.
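As a rough illustration of the sliding scale, the arithmetic could be sketched as below. This is only one possible reading: it assumes "24 hours late" means up to 24 hours after the deadline, that the penalty is taken as a percentage of the earned score, and that the policy beyond 48 hours is unspecified. The function names `late_penalty_fraction` and `homework_score` are hypothetical.

```python
def late_penalty_fraction(hours_late):
    """Fraction of the score deducted (assumed reading of the sliding scale)."""
    if hours_late <= 0:
        return 0.0    # on time: no penalty
    if hours_late <= 24:
        return 0.05   # assumed: up to 24 hours late costs 5%
    if hours_late <= 48:
        return 0.15   # assumed: up to 48 hours late costs 15%
    return None       # beyond 48 hours: not stated in the slides


def homework_score(raw_points, hours_late):
    """Apply the assumed late penalty to a raw homework score (out of 20)."""
    penalty = late_penalty_fraction(hours_late)
    if penalty is None:
        raise ValueError("policy beyond 48 hours late is not stated")
    return raw_points * (1 - penalty)


on_time = homework_score(20, 0)     # full credit
one_day = homework_score(20, 10)    # 5% deducted
two_days = homework_score(20, 30)   # 15% deducted
```

Under these assumptions, a perfect 20-point homework turned in 10 hours late would keep 95% of its points, and one turned in 30 hours late would keep 85%.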
Three Midterms
See schedule for dates
100 points each
Not cumulative
One Project
Analyze dataset using tools in class, submit written report
100 points
Due at the end of the semester
Basic Expectations
attend each class for the full 70 min duration.
take detailed notes on, or beside, the skeleton slides provided.
complete the Jupyter notebook in class.
read the assigned textbook chapters listed in the course schedule (on course website).
actively participate in group work and interactive Q&A sessions.
complete all homework assignments, quizzes, exams, and a semester project.
Emphasizes models and their interpretability, precision, and uncertainty
Machine Learning
Machine learning has a greater emphasis on large-scale applications and prediction accuracy.
Nowadays....to sound pedantic or techie?
Why should you care?
Data is everywhere, getting more
complicated and useful. Learning how to
analyze data is critical.
Web data, e-commerce (Amazon, JD, Alibaba)
Car sales (Tesla, Ford, and GM)
Sports team (MSU, Lions, etc.)
Politics and government
Image, videos, text
even fancier data in biomedicine
Learning Tools as Black Boxes? Or Math Apocalypse?
Need to understand the machinery enough to
know what tool to use
know how to interpret output of the tool
Don’t need to rebuild the entire box from scratch
Example: Email spam
Supervised learning
Outcome measurement (also called dependent variable, response, target, label).
Vector of predictor measurements (also called inputs, regressors, covariates, features, independent variables).
In the regression problem, the outcome is quantitative (e.g., price, blood pressure).
In the classification problem, the outcome takes values in a set of distinct categories (survived/died, cancer class of tissue sample, types of language).
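The difference between the regression and classification problems can be made concrete with a toy sketch. Everything here is an illustrative assumption rather than a method prescribed by the course: the numbers are made up, the function names `fit_line` and `nearest_neighbor_label` are hypothetical, and least squares and 1-nearest-neighbor are just two simple stand-ins for the many tools available.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x (quantitative outcome)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def nearest_neighbor_label(train_X, train_y, query):
    """1-nearest-neighbor classification (outcome is one of several categories)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    best = min(range(len(train_X)), key=lambda i: sq_dist(train_X[i], query))
    return train_y[best]


# Regression: made-up sizes (predictor) and prices (quantitative outcome).
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]  # constructed as 3 * size
intercept, slope = fit_line(sizes, prices)
predicted_price = intercept + slope * 90.0

# Classification: made-up two-feature samples with categorical labels.
train_X = [(1.0, 1.0), (1.2, 0.8), (4.0, 4.2), (3.8, 4.1)]
train_y = ["benign", "benign", "malignant", "malignant"]
predicted_label = nearest_neighbor_label(train_X, train_y, (3.9, 4.0))
```

In both halves the code learns from samples where the outcome is known and then predicts the outcome for a new predictor vector; only the type of outcome (a number versus a category) changes.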
Unsupervised learning
No outcome variable, just a set of predictors (features) measured on a set of samples.
Objective is fuzzier: often explore the intrinsic relation between samples (e.g., clustering) or features (e.g., dimensionality reduction)
Difficult to know how well you are doing
Different from supervised learning but can be useful as a pre-processing step for supervised learning.
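Clustering, mentioned above as a way to explore relations between samples, can be sketched with a minimal k-means loop. This is an illustration under stated assumptions, not the course's implementation: the points are made up, the function name `k_means` is hypothetical, and the starting centroids are fixed by hand so the run is deterministic (real use would normally pick them at random).

```python
def k_means(points, initial_centroids, max_iter=100):
    """Plain k-means: alternate assignment and centroid-update steps."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    centroids = list(initial_centroids)  # copy so the caller's list is untouched
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest centroid.
        new_assignments = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # assignments stopped changing
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # an empty cluster keeps its old centroid
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignments, centroids


# Made-up samples: two features each, and no outcome variable at all.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (3.8, 4.1), (4.1, 3.9)]
labels, centers = k_means(points, initial_centroids=[points[0], points[3]])
```

Note that nothing tells the algorithm whether the two groups it finds are "right", which is the sense in which it is difficult to know how well you are doing.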
Generative AI discussion
Definition via Wikipedia: Generative artificial intelligence (AI) is artificial intelligence capable of
generating text, images, or other media, using generative models. Generative AI models learn the
patterns and structure of their input training data and then generate new data that has similar
characteristics.
Examples: