microsoft · dan-dragan · Sep 23, 2024 · Oct 3, 2024 · Oct 7, 2024 · Nov 26, 2024
diff --git a/.gitignore b/.gitignore
@@ -5,12 +5,17 @@
 
 dist 
 
+# Python virtual envs
+venv*
+
 # User-specific files
 *.rsuser
 *.suo
 *.user
 *.userosscache
 *.sln.docstates
+# draw.io backups
+*.bkp
 
 # User-specific files (MonoDevelop/Xamarin Studio)
 *.userprefs

diff --git a/1-Introduction/1-intro-to-ML/DanNotes.txt b/1-Introduction/1-intro-to-ML/DanNotes.txt
@@ -0,0 +1,38 @@
+How are things learned?
+    Memorization
+        Accumulation of facts
+        Limited by:
+            Time to observe facts
+            Memory to store facts
+        ----------
+        This is "declarative knowledge" - based on statements of truth
+        ----------
+    Generalization
+        Deduce new facts from old facts
+        Limited by:
+            Accuracy of the deduction process
+        Essentially a predictive activity
+        Assumes that the past predicts the future.
+        ----------
+        This is "imperative knowledge" 
+        ----------    
+
+
+Basic paradigm:
+    - provide a set of training data (seen, observed examples)
+    - choose a characteristic of that training data that is representative of the problem
+    - infer something (a rule?) about the process that has generated that data
+    - use inference to make predictions about previously unseen data
+    - confirm inference using a set of test data
+
+A choice might have to be made between allowing false negatives or false positives in my rules; it depends on which side carries the higher risk.
+
+Issues of concern when learning models:
+    Learned models will depend on:
+        - distance metric between examples
+        - choice of feature vectors
+        - constraints on model complexity
+            - specified or unknown number of clusters
+            - complexity of separating surface
+            - need to avoid overfitting problems like "each example is its own cluster"
+
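Note on the "distance metric" and "feature vectors" bullets above: a minimal Python sketch of how the same query can receive a different label depending on the metric, or on which features are kept. The two labelled examples, the one-nearest-neighbour rule, and all function names are assumptions made for illustration; none of them come from the notes.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute per-feature differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def project(vector, indices):
    """Keep only the chosen features of a vector."""
    return tuple(vector[i] for i in indices)

def nearest_label(query, examples, distance, features=None):
    """Label of the example closest to query, optionally using a subset of features."""
    if features is not None:
        query = project(query, features)
        examples = [(project(vec, features), label) for vec, label in examples]
    return min(examples, key=lambda ex: distance(query, ex[0]))[1]

# Hypothetical labelled examples: (feature vector, label).
examples = [((3.0, 3.0), "A"), ((0.0, 5.0), "B")]
query = (0.0, 0.0)

by_euclidean = nearest_label(query, examples, euclidean)   # A is ~4.24 away, B is 5.0 away
by_manhattan = nearest_label(query, examples, manhattan)   # A is 6.0 away, B is 5.0 away
by_first_feature = nearest_label(query, examples, euclidean, features=[0])
print(by_euclidean, by_manhattan, by_first_feature)
```

The query is labelled "A" under the Euclidean metric but "B" under the Manhattan metric, and also "B" when only the first feature is kept, so both choices are part of the model rather than neutral preprocessing.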
diff --git a/1-Introduction/1-intro-to-ML/Relationship-AI-DataScience.drawio b/1-Introduction/1-intro-to-ML/Relationship-AI-DataScience.drawio
@@ -0,0 +1,25 @@
+<mxfile host="Electron" agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) draw.io/24.7.17 Chrome/128.0.6613.36 Electron/32.0.1 Safari/537.36" version="24.7.17">
+  <diagram name="Page-1" id="MCpIz8lRCv8Zt3A3B-yo">
+    <mxGraphModel dx="1098" dy="988" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="850" pageHeight="1100" math="0" shadow="0">
+      <root>
+        <mxCell id="0" />
+        <mxCell id="1" parent="0" />
+        <mxCell id="mHofzHGclkb5rKL9wMxo-1" value="AI" style="ellipse;whiteSpace=wrap;html=1;fillColor=#f5f5f5;fontColor=#333333;strokeColor=#666666;verticalAlign=top;" parent="1" vertex="1">
+          <mxGeometry x="160" y="240" width="510" height="370" as="geometry" />
+        </mxCell>
+        <mxCell id="mHofzHGclkb5rKL9wMxo-2" value="ML" style="ellipse;whiteSpace=wrap;html=1;fillColor=#dae8fc;strokeColor=#6c8ebf;verticalAlign=top;" parent="1" vertex="1">
+          <mxGeometry x="240" y="280" width="390" height="310" as="geometry" />
+        </mxCell>
+        <mxCell id="mHofzHGclkb5rKL9wMxo-3" value="Deep Learning" style="ellipse;whiteSpace=wrap;html=1;fillColor=#d5e8d4;strokeColor=#82b366;verticalAlign=top;" parent="1" vertex="1">
+          <mxGeometry x="290" y="310" width="330" height="260" as="geometry" />
+        </mxCell>
+        <mxCell id="mHofzHGclkb5rKL9wMxo-4" value="Neural Networks" style="ellipse;whiteSpace=wrap;html=1;fillColor=#f8cecc;strokeColor=#b85450;verticalAlign=top;" parent="1" vertex="1">
+          <mxGeometry x="340" y="340" width="270" height="220" as="geometry" />
+        </mxCell>
+        <mxCell id="mHofzHGclkb5rKL9wMxo-5" value="Data Science" style="ellipse;whiteSpace=wrap;html=1;fillColor=#ffe6cc;strokeColor=#d79b00;opacity=50;verticalAlign=top;" parent="1" vertex="1">
+          <mxGeometry x="540" y="140" width="120" height="570" as="geometry" />
+        </mxCell>
+      </root>
+    </mxGraphModel>
+  </diagram>
+</mxfile>
diff --git a/1-Introduction/2-history-of-ML/HistoryOfML.drawio b/1-Introduction/2-history-of-ML/HistoryOfML.drawio
@@ -0,0 +1,34 @@
+<mxfile host="Electron" agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) draw.io/24.7.17 Chrome/128.0.6613.36 Electron/32.0.1 Safari/537.36" version="24.7.17">
+  <diagram name="Page-1" id="8E3ydRyX4S0hx5a7mf1t">
+    <mxGraphModel dx="1098" dy="988" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="850" pageHeight="1100" math="0" shadow="0">
+      <root>
+        <mxCell id="0" />
+        <mxCell id="1" parent="0" />
+        <mxCell id="sA3m44e7MPCtd9_8Xl-i-1" value="" style="shape=flexArrow;endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
+          <mxGeometry width="50" height="50" relative="1" as="geometry">
+            <mxPoint x="80" y="360" as="sourcePoint" />
+            <mxPoint x="780" y="360" as="targetPoint" />
+          </mxGeometry>
+        </mxCell>
+        <mxCell id="sA3m44e7MPCtd9_8Xl-i-2" value="AI Golden Age" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
+          <mxGeometry x="160" y="370" width="120" height="60" as="geometry" />
+        </mxCell>
+        <mxCell id="sA3m44e7MPCtd9_8Xl-i-3" value="AI Winter" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
+          <mxGeometry x="280" y="370" width="120" height="60" as="geometry" />
+        </mxCell>
+        <mxCell id="sA3m44e7MPCtd9_8Xl-i-4" value="AI Expert Systems" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
+          <mxGeometry x="400" y="370" width="120" height="60" as="geometry" />
+        </mxCell>
+        <mxCell id="sA3m44e7MPCtd9_8Xl-i-5" value="AI Chill" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
+          <mxGeometry x="520" y="370" width="120" height="60" as="geometry" />
+        </mxCell>
+        <mxCell id="sA3m44e7MPCtd9_8Xl-i-6" value="LLM era&lt;div&gt;Generative AI&lt;/div&gt;" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
+          <mxGeometry x="640" y="370" width="120" height="60" as="geometry" />
+        </mxCell>
+        <mxCell id="sA3m44e7MPCtd9_8Xl-i-7" value="Machines that think" style="rounded=0;whiteSpace=wrap;html=1;" vertex="1" parent="1">
+          <mxGeometry x="40" y="370" width="120" height="60" as="geometry" />
+        </mxCell>
+      </root>
+    </mxGraphModel>
+  </diagram>
+</mxfile>
diff --git a/1-Introduction/2-history-of-ML/ML-History.json b/1-Introduction/2-history-of-ML/ML-History.json
@@ -0,0 +1,142 @@
+{
+	"title": {
+		"media": {
+			"url": "https://en.wikipedia.org/wiki/File:Alan_Turing_(1912-1954)_in_1936_at_Princeton_University_(cropped).jpg",
+			"caption": "Turing in 1936.",
+			"credit": "wikipedia/<a href='https://commons.wikimedia.org/wiki/File:Alan_Turing_(1912-1954)_in_1936_at_Princeton_University_(cropped).jpg?uselang=en#Licensing/'>wikipedia</a>"
+		},
+		"text": {
+			"headline": "History of machine learning <br/> 1950 - 2011",
+			"text": "<p>The history of artificial intelligence (AI) as a field is intertwined with the history of machine learning, as the algorithms and computational advances that underpin ML fed into the development of AI. It is useful to remember that, while these fields as distinct areas of inquiry began to crystallize in the 1950s, important algorithmic, statistical, mathematical, computational and technical discoveries predated and overlapped this era. In fact, people have been thinking about these questions for hundreds of years: this article discusses the historical intellectual underpinnings of the idea of a 'thinking machine.'</p>"
+		}
+	},
+	"events": [
+		{
+			"media": {
+				"url": "https://s.abcnews.com/images/Entertainment/whitney-cissy-dionne-gty-er-180711_hpEmbed_21x16_992.jpg",
+				"caption": "Houston's with her mother and Gospel singer, Cissy Houston and cousin Dionne Warwick.",
+				"credit": "Cissy Houston photo:<a href='http://www.flickr.com/photos/11447043@N00/418180903/'>Tom Marcello</a><br/><a href='http://commons.wikimedia.org/wiki/File%3ADionne_Warwick_television_special_1969.JPG'>Dionne Warwick: CBS Television via Wikimedia Commons</a>"
+			},
+			"start_date": {
+				"year": "1950"
+			},
+			"text": {
+				"headline": "Machines that think",
+				"text": "<p>Alan Turing, a truly remarkable person who was voted by the public in 2019 as the greatest scientist of the 20th century, is credited as helping to lay the foundation for the concept of a 'machine that can think.' He grappled with naysayers and his own need for empirical evidence of this concept in part by creating the Turing Test, which you will explore in our NLP lessons.</p>"
+			}
+		},
+		{
+			"media": {
+				"url": "https://youtu.be/fSrO91XO1Ck",
+				"caption": "",
+				"credit": "<a href=\"http://unidiscmusic.com\">Unidisc Music</a>"
+			},
+			"start_date": {
+				"year": "1956"
+			},
+			"text": {
+				"headline": "Dartmouth Summer Research Project",
+				"text": "The Dartmouth Summer Research Project on artificial intelligence was a seminal event for artificial intelligence as a field, and it was here that the term 'artificial intelligence' was coined (source)."
+			}
+		},
+		{
+			"media": {
+				"url": "https://youtu.be/fSrO91XO1Ck",
+				"caption": "",
+				"credit": "<a href=\"http://unidiscmusic.com\">Unidisc Music</a>"
+			},
+			"start_date": {
+				"year": "1956"
+			},
+			"end_date": {
+				"year": "1974"
+			},			
+			"text": {
+				"headline": "The golden years of AI",
+				"text": "From the 1950s through the mid '70s, optimism ran high in the hope that AI could solve many problems. In 1967, Marvin Minsky stated confidently that 'Within a generation ... the problem of creating 'artificial intelligence' will substantially be solved.' (Minsky, Marvin (1967), Computation: Finite and Infinite Machines, Englewood Cliffs, N.J.: Prentice-Hall)."
+			}
+		},
+		{
+			"media": {
+				"url": "https://youtu.be/fSrO91XO1Ck",
+				"caption": "",
+				"credit": "<a href=\"http://unidiscmusic.com\">Unidisc Music</a>"
+			},
+			"start_date": {
+				"year": "1974"
+			},
+			"end_date": {
+				"year": "1980"
+			},			
+			"text": {
+				"headline": "The AI Winter",
+				"text": "By the mid 1970s, it had become apparent that the complexity of making 'intelligent machines' had been understated and that its promise, given the available compute power, had been overblown. Funding dried up and confidence in the field waned."
+			}
+		},
+		{
+			"media": {
+				"url": "https://youtu.be/fSrO91XO1Ck",
+				"caption": "",
+				"credit": "<a href=\"http://unidiscmusic.com\">Unidisc Music</a>"
+			},
+			"start_date": {
+				"year": "1980"
+			},
+			"end_date": {
+				"year": "1990"
+			},			
+			"text": {
+				"headline": "The years of expert systems",
+				"text": "As the field grew, its benefit to business became clearer, and in the 1980s so did the proliferation of 'expert systems'. Expert systems were among the first truly successful forms of artificial intelligence (AI) software."
+			}
+		},		
+		{
+			"media": {
+				"url": "https://youtu.be/fSrO91XO1Ck",
+				"caption": "",
+				"credit": "<a href=\"http://unidiscmusic.com\">Unidisc Music</a>"
+			},
+			"start_date": {
+				"year": "1987"
+			},
+			"end_date": {
+				"year": "1993"
+			},			
+			"text": {
+				"headline": "The AI Chill",
+				"text": "The proliferation of specialized expert systems hardware had the unfortunate effect of becoming too specialized. The rise of personal computers also competed with these large, specialized, centralized systems. The democratization of computing had begun, and it eventually paved the way for the modern explosion of big data."
+			}
+		},
+		{
+			"media": {
+				"url": "https://youtu.be/fSrO91XO1Ck",
+				"caption": "",
+				"credit": "<a href=\"http://unidiscmusic.com\">Unidisc Music</a>"
+			},
+			"start_date": {
+				"year": "1993"
+			},
+			"end_date": {
+				"year": "2011"
+			},			
+			"text": {
+				"headline": "The ML Era",
+				"text": "This epoch saw a new era for ML and AI to be able to solve some of the problems that had been caused earlier by the lack of data and compute power. The amount of data began to rapidly increase and become more widely available, for better and for worse, especially with the advent of the smartphone around 2007. Compute power expanded exponentially, and algorithms evolved alongside. The field began to gain maturity as the freewheeling days of the past began to crystallize into a true discipline."
+			}
+		},		
+		{
+			"media": {
+				"url": "https://youtu.be/fSrO91XO1Ck",
+				"caption": "",
+				"credit": "<a href=\"http://unidiscmusic.com\">Unidisc Music</a>"
+			},
+			"start_date": {
+				"year": "2020"
+			},	
+			"text": {
+				"headline": "Now",
+				"text": "Today machine learning and AI touch almost every part of our lives. This era calls for careful understanding of the risks and potential effects of these algorithms on human lives. As Microsoft's Brad Smith has stated, 'Information technology raises issues that go to the heart of fundamental human-rights protections like privacy and freedom of expression. These issues heighten responsibility for tech companies that create these products. In our view, they also call for thoughtful government regulation and for the development of norms around acceptable uses' (source)."
+			}
+		}		
+	]
+}
diff --git a/1-Introduction/2-history-of-ML/index.html b/1-Introduction/2-history-of-ML/index.html
@@ -0,0 +1,21 @@
+<html>
+<head>
+    <link title="timeline-styles" rel="stylesheet" 
+              href="https://cdn.knightlab.com/libs/timeline3/latest/css/timeline.css">
+    <script src="https://cdn.knightlab.com/libs/timeline3/latest/js/timeline.js"></script>
+    <script src="https://code.jquery.com/jquery-3.6.0.min.js"
+        integrity="sha256-/xUj+3OJU5yExlq6GSYGSHk7tPXikynS7ogEvDej/m4=" crossorigin="anonymous"></script>
+
+    <script type="text/javascript">
+        // Fetch the timeline data once the DOM is ready, so #timeline-embed exists.
+        $(function () {
+            $.getJSON("ML-History.json", function (data) {
+                window.timeline = new TL.Timeline('timeline-embed', data);
+            });
+        });
+    </script>
+</head>
+<body>      
+    <div id='timeline-embed' style="width: 100%; height: 600px"></div>
+</body>
+</html>
diff --git a/1-Introduction/3-fairness/DanNotes.txt.bak b/1-Introduction/3-fairness/DanNotes.txt.bak
diff --git a/1-Introduction/4-techniques-of-ML/DanNotes.txt b/1-Introduction/4-techniques-of-ML/DanNotes.txt
@@ -0,0 +1,14 @@
+- Decide if AI is the right approach for your problem
+    - if the problem can be solved with a well defined set of rules -> not AI
+    - plenty of data with useful information about your problem -> AI
+- Collect and prepare your data
+    - cleanup, format, eliminate rows or fields
+    - choose features that you will use as input for predictions (such as medical history)
+    - choose what you will predict - probability for a disease
+    - split into training data and test data, say 80% to 20%
+- Train your model
+    - choose one or more algorithms, or try them all
+- Evaluate your model
+- Tune the model's hyperparameters
+- Test the trained model in the real world
+
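Note on the workflow above: a minimal Python sketch of the split, train, and evaluate steps using only the standard library. The data (a made-up risk score with a disease label), the midpoint-threshold "model", and every function name here are assumptions for illustration, not something the notes prescribe.

```python
import random

def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def train_threshold(train):
    """Learn a cut-off halfway between the mean feature value of each class."""
    positives = [x for x, y in train if y == 1]
    negatives = [x for x, y in train if y == 0]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2

def evaluate(threshold, test):
    """Return accuracy plus the counts of false positives and false negatives."""
    fp = sum(1 for x, y in test if x >= threshold and y == 0)
    fn = sum(1 for x, y in test if x < threshold and y == 1)
    return {
        "accuracy": (len(test) - fp - fn) / len(test),
        "false_positives": fp,
        "false_negatives": fn,
    }

# Hypothetical data: (risk score, has_disease), 20 rows in total.
rows = [(x, 1 if x >= 50 else 0) for x in range(0, 100, 5)]

train, test = split_train_test(rows)      # 80% / 20%
threshold = train_threshold(train)        # "Train your model"
report = evaluate(threshold, test)        # "Evaluate your model"
print(len(train), len(test), threshold, report)
```

Reporting false positives and false negatives separately, rather than accuracy alone, makes it easier to judge which kind of error the problem can tolerate.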
diff --git a/2-Regression/1-Tools/DanNotes.txt b/2-Regression/1-Tools/DanNotes.txt
@@ -0,0 +1,8 @@
+python -m venv sklearn-env
+sklearn-env\Scripts\activate  # activate the environment (Windows)
+pip install -U scikit-learn
+
+
+python -m pip show scikit-learn  # show scikit-learn version and location
+python -m pip freeze             # show all installed packages in the environment
+python -c "import sklearn; sklearn.show_versions()"
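Note on the commands above: the same checks can be made from inside Python. This is a small sketch using only the standard library; the helper names are made up for illustration, and `importlib.metadata` assumes Python 3.8 or later.

```python
import sys
from importlib import metadata

def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix

def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

print("virtual environment active:", in_virtualenv())
print("scikit-learn version:", installed_version("scikit-learn"))
```

If the second line prints `None` while the environment is active, the `pip install` step most likely ran against a different interpreter.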
diff --git a/2-Regression/1-Tools/assignment.md b/2-Regression/1-Tools/assignment.md
@@ -6,6 +6,9 @@ Take a look at the [Linnerud dataset](https://scikit-learn.org/stable/modules/ge
 
 In your own words, describe how to create a Regression model that would plot the relationship between the waistline and how many situps are accomplished. Do the same for the other datapoints in this dataset.
 
+I would load the column at index 1 of the exercise data (situps) as the numeric predictor and the column at index 1 of the physiological data (waistline) as the prediction target. I would split the samples into two thirds for training and one third for testing. I would then plot the predictions against the test values to check the correlation between situps and waistline: can the number of situps predict a person's waistline?
+
+
 ## Rubric
 
 | Criteria                       | Exemplary                           | Adequate                      | Needs Improvement          |
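Note on the assignment answer: the approach described (situps as predictor, waistline as target, two thirds for training) can be sketched as below. This is an illustration under stated assumptions, not the lesson's solution: the least-squares fit and the split are hand-written so the helpers run without extra packages, whereas the lesson would more naturally use scikit-learn's `LinearRegression` and `train_test_split`. All helper names are hypothetical. Only `load_situps_and_waist` needs scikit-learn, and it assumes `load_linnerud()` exposes the exercise columns as `data` and the physiological columns as `target`.

```python
import random

def load_situps_and_waist():
    """Load situps (exercise column 1) and waist (physiological column 1).

    Requires scikit-learn; imported here so the other helpers work without it.
    """
    from sklearn.datasets import load_linnerud
    dataset = load_linnerud()
    situps = [float(row[1]) for row in dataset.data]
    waist = [float(row[1]) for row in dataset.target]
    return situps, waist

def split(xs, ys, test_fraction=1 / 3, seed=0):
    """Shuffle the paired samples and return (train_x, train_y, test_x, test_y)."""
    pairs = list(zip(xs, ys))
    random.Random(seed).shuffle(pairs)
    n_test = max(1, round(len(pairs) * test_fraction))
    test, train = pairs[:n_test], pairs[n_test:]
    return ([x for x, _ in train], [y for _, y in train],
            [x for x, _ in test], [y for _, y in test])

def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_absolute_error(xs, ys, slope, intercept):
    """Average absolute gap between predicted and actual target values."""
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)

# Usage (needs scikit-learn installed):
#   situps, waist = load_situps_and_waist()
#   train_x, train_y, test_x, test_y = split(situps, waist)
#   slope, intercept = fit_line(train_x, train_y)
#   print(mean_absolute_error(test_x, test_y, slope, intercept))
```

Plotting the test situps against both the actual and the predicted waist values (for example with matplotlib) would then show how well the line tracks the held-out points.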

diff --git a/2-Regression/1-Tools/notebook.ipynb b/2-Regression/1-Tools/notebook.ipynb