{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Files and utilities\n", "\n", "In this lecture, we go over ways to open, close, read, write, and delete files, as well as some useful system utilities.\n", "\n", "**Created and edited by** John C.S. Lui on August 14, 2020.\n", "\n", "**Important note:** *If you want to use and modify this notebook file, please acknowledge the author.*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## File methods\n", "\n", "f = open('filename')              # open a file for reading, return a file object
\n", "f = open('filename', 'w')      # open a file for writing
\n", "f.read()                                  # return the entire remaining file content as one string
\n", "f.read(n)                                # return at most n characters
\n", "f.readline()                            # return the next line of input
\n", "f.readlines()                          # return all remaining lines of the file as a list of strings
\n", "f.write(s)                               # write string s to file
\n", "f.writelines(lst)                      # write each string in list lst to the file (no newlines are added)
\n", "f.close()                                # close file\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# process a line at a time\n", "f = open('message.txt')  # open a specific file for reading\n", "for line in f:  # iterating over a file yields one line at a time\n", "    print('Read in:', line)\n", "f.close()  # close the file when done" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# process a character at a time\n", "f = open('message.txt')  # open a specific file for reading\n", "for char in f.read():  # read the whole file, then process a character at a time\n", "    print('Read in:', char)\n", "f.close()  # close the file when done" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# process only the first 2 characters\n", "f = open('message.txt')  # open a specific file for reading\n", "for char in f.read(2):  # read at most 2 characters, then process them one at a time\n", "    print('Read in:', char)\n", "f.close()  # close the file when done" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# process the first line only\n", "f = open('message.txt')  # open a specific file for reading\n", "#print('linemode:', f.readline())\n", "\n", "for char in f.readline():  # process each character of the first line\n", "    print('Read in:', char)\n", "f.close()  # close the file when done" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# read all lines into a list, then process a line at a time\n", "f = open('message.txt')  # open a specific file for reading\n", "for line in f.readlines():  # readlines() returns a list of lines\n", "    print('Read in:', line)\n", "f.close()  # close the file when done" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# write something to the file\n", "f = open('message1.txt', \"w\")  # open the file for writing (creates it, or overwrites existing content)\n", "f.write(\"write 1st string\\n\")\n", "f.write(\"write 2nd string\\n\")\n", "f.write(\"write 3rd string\\n\")\n", "f.write(\"write 4th string\\n\")\n", "f.close()  # remember to close the file" ] }, { "cell_type": "code", 
"execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# write a list of strings to the file\n", "f = open('message1.txt', \"w\")  # open the file for writing (creates it, or overwrites existing content)\n", "f.writelines([\"1st item\\n\", \"2nd item\\n\", \"3rd item\\n\", \"4th item\\n\"])\n", "f.close()  # remember to close the file" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Operating system support\n", "\n", "At times, it is useful to ask the operating system (OS) to help us process files. Let's take a look." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "os.rename(\"message1.txt\", \"message2.txt\")  # rename a file\n", "\n", "# os.getcwd()  # show the current working directory\n", "os.system('ls -al')  # long listing of files in the current directory (Unix-like systems only)\n", "\n", "os.remove(\"message2.txt\")  # remove the file we just created" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Recovering from exceptions\n", "\n", "Sometimes, opening a file fails with an error (can you give an example of such an error?). We want a way to handle this problem. Let's illustrate." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "try:\n", "    f = open('input.txt')  # try to open a file 'input.txt'\n", "\n", "except OSError:  # raised, e.g., when the file does not exist or is not readable\n", "    print('unable to open the file')\n", "else:  # runs only if no exception was raised\n", "    print('continue with processing')\n", "    f.close()\n", "print('continue')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Standard I/O\n", "\n", "- *print* writes characters to a file that is normally attached to the display window\n", "- Input functions read from a file that is normally attached to the keyboard\n", "- These files can be accessed through the **sys** module\n", "- Input file: *sys.stdin*, output file: *sys.stdout*, error messages: *sys.stderr*\n", "- *stderr* is normally attached to the same display as *stdout*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Various input and output options\n", "\n", "- The *str()* function returns a representation of a value that is **human-readable**\n", "- The *repr()* function generates a representation that can be read by the **interpreter**\n", "- Many values, such as integers or structures like lists and dictionaries, have the same representation using either function. Strings have two distinct representations; floating point numbers also did in older Python versions, but in current Python 3 versions both functions give the same result for them." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s = 'Hello world'\n", "print(str(s))\n", "print(repr(s))\n", "print(str(1.0/7))\n", "print(repr(1.0/7))\n", "x = 10 * 3.25\n", "y = 200 * 200\n", "s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'\n", "print('str(s):', str(s), \"; repr(s):\", repr(s))\n", "print(s)\n", "hello = 'hello, world'\n", "hellos = repr(hello)\n", "print(hellos)\n", "repr((x, y, ('spam', 'eggs')))\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for x in range(1, 11):\n", "    print(repr(x).rjust(2), repr(x*x).rjust(4), end='')\n", "    print(repr(x*x*x).rjust(6))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## str.format\n", "\n", "Each `{}` in the string is a placeholder that *format()* fills in, and we can put an *index* inside the braces to choose which argument goes where." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# use {} as a placeholder\n", "print('I am the {} who say \"{}!\"'.format('bat', 'man'))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# use an index to control the ordering; these are called positional arguments\n", "print('I am the {0} who say \"{1}!\"'.format('bat', 'man'))\n", "print('I am the {1} who say \"{0}!\"'.format('bat', 'man'))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# use of keyword arguments\n", "print('This {teacher} is {adjective}.'.format\n", "      (teacher='John C.S. 
Lui', adjective='absolutely horrible'))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Positional and keyword arguments can be combined\n", "print('The great music by {0}, {1}, and {other}.'.format\n", "      ('Peter', 'Paul', other='Mary'))\n", "print('The great music by {0}, {other}, and {1}.'.format\n", "      ('Peter', 'Paul', other='Mary'))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Format output {x:y}, where x is the positional argument\n", "# and y is the format specification\n", "print('The value of PI is approximately {0:.6f}'.format\n", "      (3.1415926))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Passing an integer after \":\" sets a minimum field width, which keeps columns aligned\n", "\n", "table = {'John': 1000, 'Peter': 500, 'David': 10}  # define dictionary\n", "\n", "for name, amount in table.items():  # extract each (key, value) pair from the dictionary\n", "    print('{0:10} ==> {1:10d}'.format(name, amount))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mathematics library\n", "\n", "The *math* module gives access to the underlying C library functions for floating point math." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# using pi and cosine\n", "import math\n", "print(math.cos(math.pi / 4.0))\n", "\n", "print(math.log(1024, 2))  # logarithm of 1024 to base 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The *random* module provides tools for making random selections." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import random\n", "random.choice(['apple', 'pear', 'banana'])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "random.sample(range(100), 10)  # sample 10 items without replacement" ] }, { "cell_type": 
"code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import random\n", "print('a random float in [0, 1):', random.random())\n", "print('a random integer from range(10):', random.randrange(10))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Library for Internet Access\n", "\n", "Here, we have two modules: *urllib* and *smtplib*, for retrieving data from URLs and sending email, respectively. Please read about them in the Python documentation." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# fetch one HTML page\n", "\n", "import urllib.request\n", "\n", "response = urllib.request.urlopen('http://python.org/')\n", "html = response.read()  # the page content, as bytes\n", "print(html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Library for Performance Measurement\n", "\n", "The *timeit* module measures the execution time of small code snippets." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from timeit import Timer\n", "\n", "# Timer(statement, setup); timeit() runs the statement 1,000,000 times by default\n", "# and returns the total time in seconds\n", "print('time 1 = ', Timer('t=a; a=b; b=t', 'a=1; b=2').timeit())\n", "print('time 2 = ', Timer('a,b = b,a', 'a=1; b=2').timeit())\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" } }, "nbformat": 4, "nbformat_minor": 2 }