{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Welcome to the CSC357 machine learning lesson 00\n", "# NumPy Tutorials\n", "### What is NumPy? Why use it?\n", "> NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level\n", "mathematical functions to operate on these arrays. \n", "
As we know, a machine learning application accesses a dataset, uses an algorithm to train a model that learns from the data, and then aims to make accurate predictions.\n", "
So an important step of a machine learning application is exploring and wrangling data. Machine learning engineers constantly use NumPy, a powerful library, to access data. We will work through some examples below to learn how to use NumPy." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Examples\n", "#### Two Important Data Types in NumPy\n", "* Matrix: A matrix is a two-dimensional data structure where numbers are arranged into rows and columns.\n", "* Array: A NumPy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.\n" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "# Initialization\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "# Create a matrix" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1 2 3 4]\n", "array1's dimension: 1\n", "array1's shape: (4,)\n", "array1's size: 4 \n", "\n", "<class 'numpy.ndarray'>\n", "int32\n" ] } ], "source": [ "# Create a 1D array.\n", "\n", "array1 = np.array([1,2,3,4], dtype = np.int32)\n", "print(array1)\n", "print(\"array1's dimension: \", array1.ndim)\n", "print(\"array1's shape: \", array1.shape)\n", "print(\"array1's size: \", array1.size, \"\\n\")\n", "print(type(array1))\n", "print(array1.dtype)\n" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1 2 3]\n", " [4 5 6]]\n", "array2's dimension: 2\n", "array2's shape: (2, 3)\n", "array2's size: 6\n" ] } ], "source": [ "# Create a 2D array.\n", "\n", "array2 = np.array([[1,2,3], [4,5,6]])\n", "print(array2)\n", "print(\"array2's dimension: \", array2.ndim) # The array that we created is a 2D array. 
\n", "print(\"array2's shape: \", array2.shape) # The array has 2 rows and 3 columns, so its shape is (2, 3). \n", "print(\"array2's size: \", array2.size) # The number of elements in the array is 6." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "array3: \n", " [[0. 0. 0. 0.]\n", " [0. 0. 0. 0.]\n", " [0. 0. 0. 0.]]\n", "array4: \n", " [[1. 1. 1. 1.]\n", " [1. 1. 1. 1.]\n", " [1. 1. 1. 1.]]\n", "random array: \n", " [[0.30795646 0.11102062]\n", " [0.31321141 0.48404457]]\n" ] } ], "source": [ "# Create special 2D arrays (all zeros, all ones) and a random 2D array.\n", "\n", "array3 = np.zeros((3,4))\n", "print(\"array3: \\n\", array3)\n", "\n", "array4 = np.ones((3,4))\n", "print(\"array4: \\n\",array4)\n", "\n", "print(\"random array: \\n\",np.random.random((2,2)))\n" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "array5: \n", " [[ 1 2 3 4 5 6 7 8 9 10 11 12]]\n", "array5's shape: (1, 12) \n", "\n", "array5 after reshaping: \n", " [[ 1 2 3 4]\n", " [ 5 6 7 8]\n", " [ 9 10 11 12]]\n", "array5's shape: (3, 4)\n", "\n", "Reshape array5 without order:\n", "[[ 1 4 7 10]\n", " [ 2 5 8 11]\n", " [ 3 6 9 12]]\n" ] } ], "source": [ "# Reshape a 2D array.\n", "\n", "array5 = np.array([[1,2,3,4,5,6,7,8,9,10,11,12]])\n", "print(\"array5: \\n\", array5)\n", "print(\"array5's shape:\", array5.shape, \"\\n\")\n", "\n", "array5_reshaped = array5.reshape(3,4) \n", "print(\"array5 after reshaping: \\n\", array5_reshaped)\n", "print(\"array5's shape: \", array5_reshaped.shape)\n", "\n", "# order='F' fills the new shape column by column (Fortran-style order) instead of row by row.\n", "print(\"\\nReshape array5 without order:\")\n", "print(array5.reshape((3,4), order = 'F'))" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "array6: \n", " [[0. 0. 0. 0.]\n", " [0. 0. 0. 0.]\n", " [0. 0. 0. 0.]]\n", "array7: \n", " [[1. 1. 1. 
1.]\n", " [1. 1. 1. 1.]\n", " [1. 1. 1. 1.]]\n", "array8: \n", " [[1. 1. 1. 1.]\n", " [1. 1. 1. 1.]\n", " [1. 1. 1. 1.]]\n", "array8's shape: (3, 4)\n" ] } ], "source": [ "# Operations in arrays.\n", "\n", "# Addition\n", "array6 = np.zeros((3,4))\n", "array7 = np.ones((3,4))\n", "\n", "array8 = array6 + array7\n", "print(\"array6: \\n\", array6)\n", "print(\"array7: \\n\", array7)\n", "print(\"array8: \\n\", array8)\n", "print(\"array8's shape:\", array8.shape)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "array8: \n", " [[1. 1. 1. 1.]\n", " [1. 1. 1. 1.]\n", " [1. 1. 1. 1.]]\n", "array8's shape: (3, 4)\n", "array9: \n", " [[9 9 9 9]]\n", "array9's shape: (1, 4)\n", "Additions between array8 and array9 after broadcasting: \n", " [[10. 10. 10. 10.]\n", " [10. 10. 10. 10.]\n", " [10. 10. 10. 10.]]\n" ] } ], "source": [ "# Broadcasting 1\n", "\n", "print(\"array8: \\n\", array8)\n", "print(\"array8's shape: \", array8.shape)\n", "\n", "array9 = np.array([[9,9,9,9]])\n", "print(\"array9: \\n\", array9)\n", "print(\"array9's shape: \", array9.shape)\n", "print(\"Additions between array8 and array9 after broadcasting: \\n\", array8 + array9)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "array8: \n", " [[1. 1. 1. 1.]\n", " [1. 1. 1. 1.]\n", " [1. 1. 1. 1.]]\n", "array8's shape: (3, 4)\n", "array10: \n", " [[9]\n", " [9]\n", " [9]]\n", "array10's shape: (3, 1)\n", "Additions between array8 and array10 after broadcasting: \n", " [[10. 10. 10. 10.]\n", " [10. 10. 10. 10.]\n", " [10. 10. 10. 
10.]]\n" ] } ], "source": [ "# Broadcasting 2\n", "\n", "print(\"array8: \\n\", array8)\n", "print(\"array8's shape: \", array8.shape)\n", "\n", "array10 = np.array([[9],[9],[9]])\n", "print(\"array10: \\n\", array10)\n", "print(\"array10's shape: \", array10.shape)\n", "print(\"Additions between array8 and array10 after broadcasting: \\n\", array8 + array10)\n" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "array8: \n", " [[1. 1. 1. 1.]\n", " [1. 1. 1. 1.]\n", " [1. 1. 1. 1.]]\n", "array8's shape: (3, 4)\n", "Additions between array8 and constant after broadcasting: \n", " [[10. 10. 10. 10.]\n", " [10. 10. 10. 10.]\n", " [10. 10. 10. 10.]]\n" ] } ], "source": [ "# Broadcasting 3\n", "\n", "print(\"array8: \\n\", array8)\n", "print(\"array8's shape: \", array8.shape)\n", "\n", "print(\"Additions between array8 and constant after broadcasting: \\n\", array8 + 9)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "array8: \n", " [[1. 1. 1. 1.]\n", " [1. 1. 1. 1.]\n", " [1. 1. 1. 1.]]\n", "array8's shape: (3, 4)\n", "\n", "array8 after transposing: \n", " [[1. 1. 1.]\n", " [1. 1. 1.]\n", " [1. 1. 1.]\n", " [1. 1. 1.]]\n", "array8 transposed's shape: (4, 3)\n" ] } ], "source": [ "# Transpose\n", "\n", "array8_transposed = np.transpose(array8) # The rows become columns and columns become rows.\n", "print(\"array8: \\n\", array8)\n", "print(\"array8's shape: \", array8.shape)\n", "\n", "print(\"\\narray8 after transposing: \\n\", array8_transposed)\n", "print(\"array8 transposed's shape: \", array8_transposed.shape)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[4. 4. 4.]\n", " [4. 4. 4.]\n", " [4. 4. 
4.]]\n" ] } ], "source": [ "# Matrix multiplication (dot product)\n", "\n", "# For matrix multiplication, the number of columns in the first matrix\n", "# must equal the number of rows in the second matrix.\n", "\n", "array11 = np.dot(array8,array8_transposed) # A (3, 4) matrix times a (4, 3) matrix gives a (3, 3) matrix. \n", "print(array11)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### References: \n", "* https://medium.com/ibm-data-science-experience/markdown-for-jupyter-notebooks-cheatsheet-386c05aeebed\n", "* https://gtribello.github.io/mathNET/assets/notebook-writing.html\n", "* https://guides.github.com/pdfs/markdown-cheatsheet-online.pdf\n", "* http://cs231n.github.io/python-numpy-tutorial/\n", "* https://docs.scipy.org/doc/numpy/reference/\n", "* https://docs.scipy.org/doc/numpy/reference/arrays.html\n", "* https://www.mathwarehouse.com/algebra/matrix/multiply-matrix.php" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 2 }