Lesson 5 #781

Closed
wants to merge 5 commits into from
5 changes: 5 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -5,12 +5,17 @@

dist

# Python virtual envs
venv*

# User-specific files
*.rsuser
*.suo
*.user
*.userosscache
*.sln.docstates
# draw.io backups
*.bkp

# User-specific files (MonoDevelop/Xamarin Studio)
*.userprefs
38 changes: 38 additions & 0 deletions 1-Introduction/1-intro-to-ML/DanNotes.txt
@@ -0,0 +1,38 @@
How are things learned?
Memorization
Accumulation of facts
Limited by:
Time to observe facts
Memory to store facts
----------
This is "declarative knowledge" - based on statements of truth
----------
Generalization
Deduce new facts from old facts
Limited by:
Accuracy of the deduction process
Essentially a predictive activity
Assumes that the past predicts the future.
----------
This is "imperative knowledge"
----------


Basic paradigm:
- provide a set of - seen, observed - training data
- decide on a characteristic of that training data as representative for the issue
- infer something (a rule?) about the process that has generated that data
- use inference to make predictions about previously unseen data
- confirm inference using a set of test data
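
The paradigm above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data points are made up, and the "rule" (a threshold halfway between the two class means) is one simple choice among many, not a method taken from these notes.

```python
# Toy, made-up data: (feature value, label). Not taken from the notes.
train = [(1.0, 0), (1.5, 0), (2.0, 0), (4.0, 1), (4.5, 1), (5.0, 1)]
test = [(1.2, 0), (2.4, 0), (3.9, 1), (4.8, 1)]


def class_mean(examples, label):
    """Mean feature value of the examples carrying the given label."""
    values = [x for x, y in examples if y == label]
    return sum(values) / len(values)


def infer_threshold(examples):
    """Infer a rule from training data: the midpoint between the class means."""
    return (class_mean(examples, 0) + class_mean(examples, 1)) / 2


def predict(threshold, x):
    """Apply the inferred rule to a previously unseen value."""
    return 1 if x > threshold else 0


threshold = infer_threshold(train)

# Confirm the inference on held-out test data.
accuracy = sum(predict(threshold, x) == y for x, y in test) / len(test)
print(f"threshold={threshold:.2f}, test accuracy={accuracy:.2f}")
```

The test set is kept separate from the training set so that the accuracy reflects behaviour on data the rule has not seen.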

A choice may have to be made about which errors the rules are allowed to make, false negatives or false positives; the answer depends on which kind of error carries the higher risk.
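
The trade-off between false positives and false negatives can be made concrete by moving a decision threshold. The scores and labels below are invented for illustration, and `count_errors` is a hypothetical helper name.

```python
# Made-up classifier scores and true labels (1 = positive), for illustration only.
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
labels = [0, 0, 1, 0, 1, 1]


def count_errors(threshold):
    """Return (false_positives, false_negatives) when predicting 1 for score >= threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp, fn


for threshold in (0.2, 0.5, 0.8):
    fp, fn = count_errors(threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

A low threshold lets through more false positives; a high one produces more false negatives. Which side to favour depends on which error is costlier for the problem at hand.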

Issues of concern when learning models:
Learned models will depend on:
- distance metric between examples
- choice of feature vectors
- constraints on model complexity
- specified or unknown number of clusters
- complexity of separating surface
- need to avoid overfitting problems like "each example is its own cluster"
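
As a small illustration of the first point, the choice of distance metric alone can change which example counts as "nearest". The sketch below uses the Minkowski distance (p=1 is Manhattan, p=2 is Euclidean); the points are made up and were picked so that the two metrics disagree.

```python
def minkowski(a, b, p):
    """Minkowski distance between two points; p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def nearest(query, candidates, p):
    """Return the name of the candidate closest to query under the given p."""
    return min(candidates, key=lambda name: minkowski(query, candidates[name], p))


# Made-up points chosen so that the two metrics disagree.
candidates = {"a": (3, 3), "b": (0, 5)}
query = (0, 0)

print("Manhattan nearest:", nearest(query, candidates, p=1))
print("Euclidean nearest:", nearest(query, candidates, p=2))
```

Under Manhattan distance "b" is closer (5 versus 6); under Euclidean distance "a" is closer (about 4.24 versus 5), so any model built on nearest examples inherits this choice.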

25 changes: 25 additions & 0 deletions 1-Introduction/1-intro-to-ML/Relationship-AI-DataScience.drawio
@@ -0,0 +1,25 @@
<mxfile host="Electron" agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) draw.io/24.7.17 Chrome/128.0.6613.36 Electron/32.0.1 Safari/537.36" version="24.7.17">
<diagram name="Page-1" id="MCpIz8lRCv8Zt3A3B-yo">
<mxGraphModel dx="1098" dy="988" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="850" pageHeight="1100" math="0" shadow="0">
<root>
<mxCell id="0" />
<mxCell id="1" parent="0" />
<mxCell id="mHofzHGclkb5rKL9wMxo-1" value="AI" style="ellipse;whiteSpace=wrap;html=1;fillColor=#f5f5f5;fontColor=#333333;strokeColor=#666666;verticalAlign=top;" parent="1" vertex="1">
<mxGeometry x="160" y="240" width="510" height="370" as="geometry" />
</mxCell>
<mxCell id="mHofzHGclkb5rKL9wMxo-2" value="ML" style="ellipse;whiteSpace=wrap;html=1;fillColor=#dae8fc;strokeColor=#6c8ebf;verticalAlign=top;" parent="1" vertex="1">
<mxGeometry x="240" y="280" width="390" height="310" as="geometry" />
</mxCell>
<mxCell id="mHofzHGclkb5rKL9wMxo-3" value="Deep Learning" style="ellipse;whiteSpace=wrap;html=1;fillColor=#d5e8d4;strokeColor=#82b366;verticalAlign=top;" parent="1" vertex="1">
<mxGeometry x="290" y="310" width="330" height="260" as="geometry" />
</mxCell>
<mxCell id="mHofzHGclkb5rKL9wMxo-4" value="Neural Networks" style="ellipse;whiteSpace=wrap;html=1;fillColor=#f8cecc;strokeColor=#b85450;verticalAlign=top;" parent="1" vertex="1">
<mxGeometry x="340" y="340" width="270" height="220" as="geometry" />
</mxCell>
<mxCell id="mHofzHGclkb5rKL9wMxo-5" value="Data Science" style="ellipse;whiteSpace=wrap;html=1;fillColor=#ffe6cc;strokeColor=#d79b00;opacity=50;verticalAlign=top;" parent="1" vertex="1">
<mxGeometry x="540" y="140" width="120" height="570" as="geometry" />
</mxCell>
</root>
</mxGraphModel>
</diagram>
</mxfile>
34 changes: 34 additions & 0 deletions 1-Introduction/2-history-of-ML/HistoryOfML.drawio
@@ -0,0 +1,34 @@
<mxfile host="Electron" agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) draw.io/24.7.17 Chrome/128.0.6613.36 Electron/32.0.1 Safari/537.36" version="24.7.17">
<diagram name="Page-1" id="8E3ydRyX4S0hx5a7mf1t">
<mxGraphModel dx="1098" dy="988" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="850" pageHeight="1100" math="0" shadow="0">
<root>
<mxCell id="0" />
<mxCell id="1" parent="0" />
<mxCell id="sA3m44e7MPCtd9_8Xl-i-1" value="" style="shape=flexArrow;endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="80" y="360" as="sourcePoint" />
<mxPoint x="780" y="360" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="sA3m44e7MPCtd9_8Xl-i-2" value="AI Golden Age" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="160" y="370" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="sA3m44e7MPCtd9_8Xl-i-3" value="AI Winter" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="280" y="370" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="sA3m44e7MPCtd9_8Xl-i-4" value="AI Expert Systems" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="400" y="370" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="sA3m44e7MPCtd9_8Xl-i-5" value="AI Chill" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="520" y="370" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="sA3m44e7MPCtd9_8Xl-i-6" value="LLM era&lt;div&gt;Generative AI&lt;/div&gt;" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="640" y="370" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="sA3m44e7MPCtd9_8Xl-i-7" value="Machines that think" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="40" y="370" width="120" height="60" as="geometry" />
</mxCell>
</root>
</mxGraphModel>
</diagram>
</mxfile>
142 changes: 142 additions & 0 deletions 1-Introduction/2-history-of-ML/ML-History.json
@@ -0,0 +1,142 @@
{
"title": {
"media": {
"url": "https://en.wikipedia.org/wiki/File:Alan_Turing_(1912-1954)_in_1936_at_Princeton_University_(cropped).jpg",
"caption": "Turing in 1936.",
"credit": "wikipedia/<a href='https://commons.wikimedia.org/wiki/File:Alan_Turing_(1912-1954)_in_1936_at_Princeton_University_(cropped).jpg?uselang=en#Licensing/'>wikipedia</a>"
},
"text": {
"headline": "History of machine learning <br/> 1950 - 2011",
"text": "<p>The history of artificial intelligence (AI) as a field is intertwined with the history of machine learning, as the algorithms and computational advances that underpin ML fed into the development of AI. It is useful to remember that, while these fields as distinct areas of inquiry began to crystallize in the 1950s, important algorithmic, statistical, mathematical, computational and technical discoveries predated and overlapped this era. In fact, people have been thinking about these questions for hundreds of years: this article discusses the historical intellectual underpinnings of the idea of a 'thinking machine.'</p>"
}
},
"events": [
{
"media": {
"url": "https://en.wikipedia.org/wiki/File:Alan_Turing_(1912-1954)_in_1936_at_Princeton_University_(cropped).jpg",
"caption": "Turing in 1936.",
"credit": "wikipedia/<a href='https://commons.wikimedia.org/wiki/File:Alan_Turing_(1912-1954)_in_1936_at_Princeton_University_(cropped).jpg?uselang=en#Licensing/'>wikipedia</a>"
},
"start_date": {
"year": "1950"
},
"text": {
"headline": "Machines that think",
"text": "<p>Alan Turing, a truly remarkable person who was voted by the public in 2019 as the greatest scientist of the 20th century, is credited as helping to lay the foundation for the concept of a 'machine that can think.' He grappled with naysayers and his own need for empirical evidence of this concept in part by creating the Turing Test, which you will explore in our NLP lessons.</p>"
}
},
{
"media": {
"url": "https://youtu.be/fSrO91XO1Ck",
"caption": "",
"credit": "<a href=\"http://unidiscmusic.com\">Unidisc Music</a>"
},
"start_date": {
"year": "1956"
},
"text": {
"headline": "Dartmouth Summer Research Project",
"text": "The Dartmouth Summer Research Project on artificial intelligence was a seminal event for artificial intelligence as a field, and it was here that the term 'artificial intelligence' was coined (source)."
}
},
{
"media": {
"url": "https://youtu.be/fSrO91XO1Ck",
"caption": "",
"credit": "<a href=\"http://unidiscmusic.com\">Unidisc Music</a>"
},
"start_date": {
"year": "1956"
},
"end_date": {
"year": "1974"
},
"text": {
"headline": "The golden years of AI",
"text": "From the 1950s through the mid '70s, optimism ran high in the hope that AI could solve many problems. In 1967, Marvin Minsky stated confidently that 'Within a generation ... the problem of creating 'artificial intelligence' will substantially be solved.' (Minsky, Marvin (1967), Computation: Finite and Infinite Machines, Englewood Cliffs, N.J.: Prentice-Hall)."
}
},
{
"media": {
"url": "https://youtu.be/fSrO91XO1Ck",
"caption": "",
"credit": "<a href=\"http://unidiscmusic.com\">Unidisc Music</a>"
},
"start_date": {
"year": "1974"
},
"end_date": {
"year": "1980"
},
"text": {
"headline": "The AI Winter",
"text": "By the mid 1970s, it had become apparent that the complexity of making 'intelligent machines' had been understated and that its promise, given the available compute power, had been overblown. Funding dried up and confidence in the field waned."
}
},
{
"media": {
"url": "https://youtu.be/fSrO91XO1Ck",
"caption": "",
"credit": "<a href=\"http://unidiscmusic.com\">Unidisc Music</a>"
},
"start_date": {
"year": "1980"
},
"end_date": {
"year": "1990"
},
"text": {
"headline": "The years of the Expert systems",
"text": "As the field grew, its benefit to business became clearer, and the 1980s saw a proliferation of 'expert systems'. Expert systems were among the first truly successful forms of artificial intelligence (AI) software."
}
},
{
"media": {
"url": "https://youtu.be/fSrO91XO1Ck",
"caption": "",
"credit": "<a href=\"http://unidiscmusic.com\">Unidisc Music</a>"
},
"start_date": {
"year": "1987"
},
"end_date": {
"year": "1993"
},
"text": {
"headline": "The AI Chill",
"text": "The hardware built for expert systems proliferated but had the unfortunate effect of becoming too specialized. The rise of personal computers also competed with these large, specialized, centralized systems. The democratization of computing had begun, and it eventually paved the way for the modern explosion of big data."
}
},
{
"media": {
"url": "https://youtu.be/fSrO91XO1Ck",
"caption": "",
"credit": "<a href=\"http://unidiscmusic.com\">Unidisc Music</a>"
},
"start_date": {
"year": "1993"
},
"end_date": {
"year": "2011"
},
"text": {
"headline": "The ML Era",
"text": "In this epoch ML and AI became able to solve some of the problems that had earlier been caused by the lack of data and compute power. The amount of data began to increase rapidly and became more widely available, for better and for worse, especially with the advent of the smartphone around 2007. Compute power expanded exponentially, and algorithms evolved alongside. The field began to gain maturity as the freewheeling days of the past began to crystallize into a true discipline."
}
},
{
"media": {
"url": "https://youtu.be/fSrO91XO1Ck",
"caption": "",
"credit": "<a href=\"http://unidiscmusic.com\">Unidisc Music</a>"
},
"start_date": {
"year": "2020"
},
"text": {
"headline": "Now",
"text": "Today machine learning and AI touch almost every part of our lives. This era calls for careful understanding of the risks and potential effects of these algorithms on human lives. As Microsoft's Brad Smith has stated, 'Information technology raises issues that go to the heart of fundamental human-rights protections like privacy and freedom of expression. These issues heighten responsibility for tech companies that create these products. In our view, they also call for thoughtful government regulation and for the development of norms around acceptable uses' (source)."
}
}
]
}
21 changes: 21 additions & 0 deletions 1-Introduction/2-history-of-ML/index.html
@@ -0,0 +1,21 @@
<html>
<head>
<link title="timeline-styles" rel="stylesheet"
href="https://cdn.knightlab.com/libs/timeline3/latest/css/timeline.css">
<script src="https://cdn.knightlab.com/libs/timeline3/latest/js/timeline.js"></script>
<script src="https://code.jquery.com/jquery-3.6.0.min.js"
integrity="sha256-/xUj+3OJU5yExlq6GSYGSHk7tPXikynS7ogEvDej/m4=" crossorigin="anonymous"></script>

<script type="text/javascript">
// ML-History.json is data, not a script, so it is fetched with getJSON only.
// Wait for the DOM so that #timeline-embed exists before the timeline is built.
$(function () {
    $.getJSON("ML-History.json", function (data) {
        window.timeline = new TL.Timeline('timeline-embed', data);
    });
});
</script>
</head>
<body>
<div id='timeline-embed' style="width: 100%; height: 600px"></div>
</body>
</html>
Empty file.
14 changes: 14 additions & 0 deletions 1-Introduction/4-techniques-of-ML/DanNotes.txt
@@ -0,0 +1,14 @@
- Decide if AI is the right approach for your problem
- if the problem can be solved with a well-defined set of rules -> not AI
- plenty of data with useful information about your problem -> AI
- Collect and prepare your data
- clean up, format, eliminate rows or fields
- choose features that you will use as input for predictions (such as medical history)
- choose what you will predict, e.g. the probability of a disease
- split into training data and test data, say 80% to 20%
- Train your model
- choose algorithms, or try them all and compare
- Evaluate your model
- Tune the model's hyperparameters
- Test the trained model in the real world
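
The split and evaluate steps above can be sketched with the standard library alone. Everything here is an assumption made for illustration: the rows are generated, the feature names (`age`, `smoker`) are hypothetical, and the "model" is only a majority-class baseline that a real model would be expected to beat.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Made-up rows: (features, label). The feature names are hypothetical.
rows = [({"age": 30 + i, "smoker": i % 2}, i % 3 == 0) for i in range(20)]

train, test = split_train_test(rows)  # 80% / 20%

# Baseline "model": always predict the most common label seen in training.
train_labels = [label for _, label in train]
majority = max(set(train_labels), key=train_labels.count)

# Evaluate on the held-out test rows only.
accuracy = sum(label == majority for _, label in test) / len(test)
print(f"train={len(train)}, test={len(test)}, baseline accuracy={accuracy:.2f}")
```

Fixing the seed keeps the split reproducible, which makes later hyperparameter comparisons fair because every candidate is scored on the same test rows.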

8 changes: 8 additions & 0 deletions 2-Regression/1-Tools/DanNotes.txt
@@ -0,0 +1,8 @@
python -m venv sklearn-env
sklearn-env\Scripts\activate # activate (Windows; on macOS/Linux use: source sklearn-env/bin/activate)
pip install -U scikit-learn


python -m pip show scikit-learn # show scikit-learn version and location
python -m pip freeze # show all installed packages in the environment
python -c "import sklearn; sklearn.show_versions()"
3 changes: 3 additions & 0 deletions 2-Regression/1-Tools/assignment.md
@@ -6,6 +6,9 @@ Take a look at the [Linnerud dataset](https://scikit-learn.org/stable/modules/ge

In your own words, describe how to create a Regression model that would plot the relationship between the waistline and how many situps are accomplished. Do the same for the other datapoints in this dataset.

I would load the column at index 1 of the exercise data (situps) as the numeric predictor and the column at index 1 of the physiological targets (waistline) as the prediction target. I would split the data into 2/3 for training and 1/3 for testing. I would plot the predictions against the test values to confirm the correlation between situps and waistline: can the number of situps predict a person's waistline?
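
A minimal sketch of that approach, using only the standard library: the (situps, waistline) pairs below are made up for illustration and are NOT the Linnerud values, and `fit_line` is a hypothetical helper that performs an ordinary least-squares fit of a straight line.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up (situps, waistline) pairs for illustration; NOT the Linnerud values.
pairs = [(50, 38), (100, 37), (150, 34), (200, 33), (250, 30), (300, 29)]

# Hold out every third pair: 2/3 for training, 1/3 for testing.
test = pairs[2::3]
train = [p for i, p in enumerate(pairs) if i % 3 != 2]

slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])

# Compare predictions with the held-out waistline values.
errors = [abs((slope * x + intercept) - y) for x, y in test]
mae = sum(errors) / len(errors)
print(f"slope={slope:.3f}, intercept={intercept:.2f}, test MAE={mae:.2f}")
```

With the real dataset, the same steps would apply to the situps and waistline columns, and plotting the predicted line over the test points would show how well one tracks the other.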


## Rubric

| Criteria | Exemplary | Adequate | Needs Improvement |
321 changes: 321 additions & 0 deletions 2-Regression/1-Tools/notebook.ipynb

Large diffs are not rendered by default.