{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Files and utilities\n",
    "\n",
    "In this lecture, we will go over ways to open/close/read/write/delete files, as well as some useful system utilities.\n",
    "\n",
    "**Created and edited by**  John C.S. Lui on August 14, 2020.\n",
    "\n",
    "**Important note:** *If you want to use and modify this notebook file, please acknowledge the author.*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## File methods\n",
    "\n",
    "| Call | What it does |\n",
    "| --- | --- |\n",
    "| `f = open(\"filename\")` | open a file for reading; returns a file object |\n",
    "| `f = open(\"filename\", \"w\")` | open a file for writing (an existing file is overwritten) |\n",
    "| `f.read()` | return the rest of the file as a single string |\n",
    "| `f.read(n)` | return no more than n characters |\n",
    "| `f.readline()` | return the next line of input |\n",
    "| `f.readlines()` | return all remaining lines as a list of strings |\n",
    "| `f.write(s)` | write string s to the file |\n",
    "| `f.writelines(lst)` | write each string in list lst to the file (no newlines are added) |\n",
    "| `f.close()` | close the file |\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# process a line at a time\n",
    "f = open('message.txt')   # open a specific file\n",
    "for line in f:            # process a line at a time\n",
    "    print('Read in:', line)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# process a character at a time\n",
    "f = open('message.txt')   # open a specific file\n",
    "for char in f.read():      # process a character at a time\n",
    "    print('Read in:', char)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# process only the first 2 characters\n",
    "f = open('message.txt')   # open a specific file\n",
    "for char in f.read(2):    # read at most 2 characters, then loop over them\n",
    "    print('Read in:', char)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# process the first line only, one character at a time\n",
    "f = open('message.txt')   # open a specific file\n",
    "#print('linemode:', f.readline())\n",
    "\n",
    "for char in f.readline():      # readline() returns one string, so this loops over its characters\n",
    "    print('Read in:', char)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# read all lines into a list, then process a line at a time\n",
    "f = open('message.txt')   # open a specific file\n",
    "for line in f.readlines():      # readlines() returns a list of lines\n",
    "    print('Read in:', line)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# write strings to the file, one call per line\n",
    "f = open('message1.txt', \"w\")   # open for writing; creates the file or overwrites it\n",
    "f.write(\"write 1st string\\n\")   \n",
    "f.write(\"write 2nd string\\n\")\n",
    "f.write(\"write 3rd string\\n\")\n",
    "f.write(\"write 4th string\\n\")\n",
    "f.close()                       # remember to close the file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# write a list of strings to the file\n",
    "f = open('message1.txt', \"w\")   # open for writing; creates the file or overwrites it\n",
    "f.writelines([\"1st item\\n\", \"2nd item\\n\", \"3rd item\\n\", \"4th item\\n\"])   # newlines are not added for us\n",
    "f.close()                       # remember to close the file"
   ]
  },
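  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The write cells above close the file with an explicit `f.close()`, which is easy to forget. Below is a minimal sketch of the same write-then-read pattern using a `with` block, which closes the file automatically, even if an error occurs inside the block. The file name `demo_with.txt` is a hypothetical name used only for this sketch."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "filename = 'demo_with.txt'   # hypothetical file name, used only for this sketch\n",
    "\n",
    "# the file is closed automatically when the with-block ends\n",
    "with open(filename, 'w') as f:\n",
    "    f.writelines(['1st item\\n', '2nd item\\n'])\n",
    "\n",
    "with open(filename) as f:\n",
    "    lines = f.readlines()    # list of lines, each still ending in a newline\n",
    "\n",
    "print(lines)\n",
    "print('closed after the block:', f.closed)\n",
    "\n",
    "os.remove(filename)          # clean up the demo file"
   ]
  },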
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Operating system support\n",
    "\n",
    "At times, it is useful to ask the operating system (OS) to help us manage files, for example to rename, list or delete them.  Let's take a look."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "os.rename(\"message1.txt\", \"message2.txt\")  # rename the file written above\n",
    "\n",
    "# os.getcwd()             # show the current working directory\n",
    "os.system('ls -al')     # long listing of the current directory (Unix shells only)\n",
    "\n",
    "os.remove(\"message2.txt\") # remove the file we just created"
   ]
  },
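  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`os.system('ls -al')` depends on a Unix shell command. Below is a minimal sketch of the same rename/list/remove steps using only functions from the `os` module, run inside a temporary directory so that it does not touch your own files. The file names `a.txt` and `b.txt` are hypothetical."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import tempfile\n",
    "\n",
    "with tempfile.TemporaryDirectory() as tmp:   # scratch directory, deleted afterwards\n",
    "    old = os.path.join(tmp, 'a.txt')         # hypothetical file names\n",
    "    new = os.path.join(tmp, 'b.txt')\n",
    "\n",
    "    with open(old, 'w') as f:\n",
    "        f.write('hello\\n')\n",
    "\n",
    "    os.rename(old, new)                      # rename a.txt to b.txt\n",
    "    listing = os.listdir(tmp)                # names in the directory, no shell needed\n",
    "    print('after rename:', listing)\n",
    "\n",
    "    os.remove(new)                           # delete the file\n",
    "    still_there = os.path.exists(new)\n",
    "    print('exists after remove:', still_there)"
   ]
  },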
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Recovering from exception\n",
    "\n",
    "Sometimes, opening a file fails with an error (can you give an example of such an error?).  We want a way to handle this problem. Let's illustrate."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    f = open('input.txt')   # try to open a file 'input.txt'\n",
    "\n",
    "except OSError:\n",
    "    print ('unable to open the file')\n",
    "else:\n",
    "    print ('continue with processing')\n",
    "    f.close()\n",
    "print ('continue')\n"
   ]
  },
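  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`OSError` covers many different failures, such as a missing file or a lack of permission. Below is a minimal sketch that tells these cases apart with the built-in subclasses `FileNotFoundError` and `PermissionError`. The helper `read_text_or_none` and the file name `no_such_file.txt` are hypothetical names chosen for illustration, and the file is assumed not to exist."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# hypothetical helper: return the file content, or None if the file cannot be opened\n",
    "def read_text_or_none(path):\n",
    "    try:\n",
    "        with open(path) as f:\n",
    "            return f.read()\n",
    "    except FileNotFoundError:\n",
    "        print('no such file:', path)\n",
    "    except PermissionError:\n",
    "        print('not allowed to read:', path)\n",
    "    except OSError as err:            # any other I/O problem\n",
    "        print('unable to open the file:', err)\n",
    "    return None\n",
    "\n",
    "result = read_text_or_none('no_such_file.txt')   # assumed not to exist\n",
    "print(result)"
   ]
  },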
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Standard I/O\n",
    "\n",
    "- `print` writes characters to a file that is normally attached to the display window\n",
    "- Input functions such as `input()` read from a file that is normally attached to the keyboard\n",
    "- These files can be accessed through the **sys** module\n",
    "- Input file: *sys.stdin*, output file: *sys.stdout*, error messages: *sys.stderr*\n",
    "- *stderr* is a separate stream, but it normally goes to the same display as *stdout*"
   ]
  },
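  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below is a minimal sketch of the standard output and standard error streams in use. `print` accepts a `file` argument, so the same call can write to either stream. The helper name `report` is hypothetical."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "\n",
    "# hypothetical helper: write a message to the given stream (standard output by default)\n",
    "def report(message, stream=None):\n",
    "    if stream is None:\n",
    "        stream = sys.stdout          # looked up at call time\n",
    "    print(message, file=stream)\n",
    "\n",
    "report('normal output')                        # goes to sys.stdout\n",
    "report('something went wrong', sys.stderr)     # goes to sys.stderr\n",
    "sys.stdout.write('write() does not add a newline by itself\\n')"
   ]
  },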
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Various input and output options\n",
    "\n",
    "- The *str()* function returns a representation of a value that is **human-readable**\n",
    "- The *repr()* function generates a representation that can be read by the **interpreter**\n",
    "- Many values, such as numbers or structures like lists and dictionaries, have the same representation using either function. Strings, in particular, have two distinct representations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "s = 'Hello world'\n",
    "print (str(s))\n",
    "print (repr(s))\n",
    "print (str(1.0/7))\n",
    "print (repr(1.0/7))\n",
    "x = 10*3.25\n",
    "y = 200* 200\n",
    "s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'\n",
    "print ('str(s):', str(s), \"; repr(s):\", repr(s))\n",
    "print(s)\n",
    "hello = 'hello, world'\n",
    "hellos = repr(hello)\n",
    "print (hellos)\n",
    "repr((x,y,('spam','eggs')))\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for x in range (1,11):\n",
    "    print(repr(x).rjust(2), repr(x*x).rjust(4), end='')\n",
    "    print (repr(x*x*x).rjust(6))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## str.format \n",
    "\n",
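    "In `str.format`, each pair of braces `{}` is a placeholder that is replaced by an argument of `format()`. An *index* or a name inside the braces selects which argument to use."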
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# use {} as a placeholder\n",
    "print('I am the {} who say \"{}!\"'.format('bat', 'man'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# use an index to control the ordering; the values passed to format() are positional arguments\n",
    "print('I am the {0} who say \"{1}!\"'.format('bat', 'man'))\n",
    "print('I am the {1} who say \"{0}!\"'.format('bat', 'man'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# use of keyword argument\n",
    "print ('This {teacher} is {adjective}.'.format\n",
    "             (teacher='John C.S. Lui', adjective='absolutely horrible'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Positional and keyword arguments can be arbitrarily combined\n",
    "print ('The great music by {0}, {1}, and {other}.'.format\n",
    "         ('Peter', 'Paul', other='Mary'))\n",
    "print ('The great music by {0}, {other}, and {1}.'.format\n",
    "         ('Peter', 'Paul', other='Mary'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Format output {x:y}, where x is the positional argument\n",
    "#                   y is the format\n",
    "print ('The value of PI is approximately {0:.6f}'.format\n",
    "        (3.1415926))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# An integer after \":\" sets a minimum field width, which keeps the columns aligned\n",
    "\n",
    "table ={'John': 1000, 'Peter':500, 'David': 10} # define dictionary\n",
    "\n",
    "for name, amount in table.items():  # extract item from dictionary\n",
    "    print ('{0:10} ==> {1:10d}'.format(name, amount))"
   ]
  },
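  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The number after `:` is a minimum field width. Below is a minimal sketch of the alignment flags `<` (left), `>` (right) and `^` (centre) from the format specification, applied to the same kind of name/amount table. The variable name `rows` and the extra `'ok'` column are hypothetical additions for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "table = {'John': 1000, 'Peter': 500, 'David': 10}\n",
    "\n",
    "rows = []   # hypothetical list that collects the formatted lines\n",
    "for name, amount in table.items():\n",
    "    # name left-aligned in 10 columns, amount right-aligned in 8 columns with\n",
    "    # 2 decimals, and a status string centred in 7 columns\n",
    "    rows.append('{0:<10}|{1:>8.2f}|{2:^7}'.format(name, amount, 'ok'))\n",
    "\n",
    "for row in rows:\n",
    "    print(row)"
   ]
  },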
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Mathematics library\n",
    "\n",
    "The *math* module gives access to the underlying C library functions for floating point math"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# using pi and cosine\n",
    "import math\n",
    "print (math.cos(math.pi / 4.0))\n",
    "\n",
    "print(math.log(1024,2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The *random* module provides tools for making random selections."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "random.choice(['apple', 'pear', 'banana'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "random.sample(range(100), 10)   # sampling without replacement of 10 items"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "print('a random float number: ', random.random())\n",
    "print('a random integer from range(10): ', random.randrange(10))"
   ]
  },
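  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The calls above give different results on every run. Below is a minimal sketch of how `random.seed` makes a sequence repeatable, which helps when debugging. The seed value 2020 and the helper name `draw` are arbitrary choices for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "\n",
    "# hypothetical helper: seed the generator, then draw a few pseudo-random values\n",
    "def draw(seed):\n",
    "    random.seed(seed)                        # reset the generator to a known state\n",
    "    return (random.randrange(10),\n",
    "            random.choice(['apple', 'pear', 'banana']),\n",
    "            random.sample(range(100), 3))\n",
    "\n",
    "first = draw(2020)\n",
    "second = draw(2020)\n",
    "print(first)\n",
    "print('same seed, same values:', first == second)"
   ]
  },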
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Library for Internet Access\n",
    "\n",
    "Here, we have two modules: *urllib* for retrieving data from URLs and *smtplib* for sending email. Please read about them in the Python documentation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
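    "# read one html file\n",
    "\n",
    "import urllib.request\n",
    "\n",
    "# the with-block closes the connection when we are done\n",
    "with urllib.request.urlopen('http://python.org/') as response:\n",
    "    html = response.read()      # read() returns bytes, not str\n",
    "print(html)                     # printed as a bytes literal, b'...'"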
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Library for Performance Measurement\n",
    "\n",
    "The *timeit* module measures the execution time of small pieces of code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from timeit import Timer\n",
    "\n",
    "print('time 1 = ', Timer('t=a; a=b; b=t', 'a=1; b=2').timeit())\n",
    "print('time 2 = ', Timer('a,b = b,a', 'a=1; b=2').timeit())\n"
   ]
  },
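  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By default, `Timer.timeit()` executes the statement one million times and returns the total time in seconds. Below is a minimal sketch that sets the repetition count explicitly and takes the best of several measurements. The values chosen for `number` and `repeat` are arbitrary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from timeit import Timer\n",
    "\n",
    "timer = Timer('a, b = b, a', 'a = 1; b = 2')\n",
    "\n",
    "# run the statement 100000 times per measurement, and take 5 measurements\n",
    "results = timer.repeat(repeat=5, number=100000)\n",
    "\n",
    "best = min(results)   # the minimum is the least disturbed by other processes\n",
    "print('best of 5: {0:.6f} seconds for 100000 swaps'.format(best))"
   ]
  },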
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}