{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Files and utilities\n",
"\n",
"In this lecture, we will go over ways to open/close/read/write/delete files, as well as some useful system utilities.\n",
"\n",
"**Created and edited by** John C.S. Lui on August 14, 2020.\n",
"\n",
"**Important note:** *If you want to use and modify this notebook file, please acknowledge the author.*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## File methods\n",
"\n",
"- `f = open('filename')`: open a file for reading and return a file object\n",
"- `f = open('filename', 'w')`: open a file for writing (an existing file is overwritten)\n",
"- `f.read()`: return the entire remaining content of the file as one string\n",
"- `f.read(n)`: return no more than `n` characters\n",
"- `f.readline()`: return the next line of input\n",
"- `f.readlines()`: return all remaining lines as a list of strings\n",
"- `f.write(s)`: write string `s` to the file\n",
"- `f.writelines(lst)`: write each string in list `lst` to the file (no newlines are added)\n",
"- `f.close()`: close the file\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# process a line at a time\n",
"f = open('message.txt') # open a specific file\n",
"for line in f: # process a line at a time\n",
" print('Read in:', line)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# process a character at a time\n",
"f = open('message.txt') # open a specific file\n",
"for char in f.read(): # process a character at a time\n",
" print('Read in:', char)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# process only the first 2 characters\n",
"f = open('message.txt') # open a specific file\n",
"for char in f.read(2): # read 2 characters, then process 1 character at a time\n",
" print('Read in:', char)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# process the first line only, one character at a time\n",
"f = open('message.txt') # open a specific file\n",
"#print('linemode:', f.readline())\n",
"\n",
"for char in f.readline(): # iterate over the characters of the first line\n",
" print('Read in:', char)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# process a line at a time\n",
"f = open('message.txt') # open a specific file\n",
"for line in f.readlines(): # readlines() returns a list of all lines\n",
" print('Read in:', line)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# write something to the file\n",
"f = open('message1.txt', \"w\") # open a specific file for writing\n",
"f.write(\"write 1st string\\n\") \n",
"f.write(\"write 2nd string\\n\")\n",
"f.write(\"write 3rd string\\n\")\n",
"f.write(\"write 4th string\\n\")\n",
"f.close() # remember to close the file"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# write a list of strings to the file\n",
"f = open('message1.txt', \"w\") # reopen the file for writing (this overwrites the previous content)\n",
"f.writelines([\"1st item\\n\", \"2nd item\\n\", \"3rd item\\n\", \"4th item\\n\"]) \n",
"f.close() # remember to close the file"
]
},
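{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cells above close the file with an explicit `f.close()`, or leave it open. As a minimal sketch of an alternative, the `with` statement closes the file automatically when its block ends, even if an error occurs inside it. The file name `with_demo.txt` is made up for this illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: let a with-block close the file for us.\n",
"# 'with_demo.txt' is a hypothetical file name used only for this example.\n",
"import os\n",
"\n",
"with open('with_demo.txt', 'w') as f: # the file is closed when the block ends\n",
"    f.writelines(['1st item\\n', '2nd item\\n'])\n",
"\n",
"with open('with_demo.txt') as f:\n",
"    lines = [line.rstrip('\\n') for line in f] # drop the trailing newline of each line\n",
"\n",
"print(lines)\n",
"print('file closed?', f.closed)\n",
"\n",
"os.remove('with_demo.txt') # clean up the demo file"
]
},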
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Operating system support\n",
"\n",
"At times, it is useful to ask the operating system (OS) to help us manage files. Let's take a look."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"os.rename(\"message1.txt\", \"message2.txt\") # rename a file\n",
"\n",
"# os.getcwd() # show the current working directory\n",
"os.system('ls -al') # long listing of files in the current directory (Unix-like systems only)\n",
"\n",
"os.remove(\"message2.txt\") # remove the file we just renamed"
]
},
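{
"cell_type": "markdown",
"metadata": {},
"source": [
"`os.system('ls -al')` relies on a Unix shell command. A hedged sketch of a portable alternative that uses only functions from the `os` module follows; what it prints depends on the contents of your own directory."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: list the current directory without calling an external command.\n",
"import os\n",
"\n",
"print('working directory:', os.getcwd())\n",
"for name in sorted(os.listdir('.')): # names of all entries in the current directory\n",
"    kind = 'dir ' if os.path.isdir(name) else 'file'\n",
"    print(kind, name)"
]
},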
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Recovering from exception\n",
"\n",
"Opening a file can fail, for example when the file does not exist or we lack permission to read it (can you think of other examples?). We need a way to handle such errors. Let's illustrate."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" f = open('input.txt') # try to open a file 'input.txt'\n",
"\n",
"except OSError:\n",
" print ('unable to open the file')\n",
"else:\n",
" print ('continue with processing')\n",
" f.close()\n",
"print ('continue')\n"
]
},
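{
"cell_type": "markdown",
"metadata": {},
"source": [
"`OSError` covers many different failures. The following is a minimal sketch, assuming a file named `no_such_file.txt` (a made-up name) does not exist in the current directory. It shows how the subclasses `FileNotFoundError` and `PermissionError` can be handled separately before falling back to `OSError`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: handle specific subclasses of OSError first, the general case last.\n",
"# 'no_such_file.txt' is a hypothetical name, assumed not to exist.\n",
"try:\n",
"    with open('no_such_file.txt') as f:\n",
"        data = f.read()\n",
"except FileNotFoundError as err:\n",
"    print('file not found:', err.filename)\n",
"except PermissionError as err:\n",
"    print('permission denied:', err.filename)\n",
"except OSError as err:\n",
"    print('other OS error:', err)\n",
"else:\n",
"    print('read', len(data), 'characters') # runs only if no exception was raised"
]
},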
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Standard I/O\n",
"\n",
"- `print` writes characters to a file that is normally attached to the display window\n",
"- Input functions read from a file that is normally attached to the keyboard\n",
"- These files can be accessed through the **sys** module\n",
"- Input file: *sys.stdin*, output file: *sys.stdout*, error messages: *sys.stderr*\n",
"- *stderr* normally goes to the same display as *stdout*, but the two can be redirected separately"
]
},
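{
"cell_type": "markdown",
"metadata": {},
"source": [
"The points above can be sketched as follows, using only the standard **sys** module. In a notebook, text sent to *sys.stderr* is typically displayed differently from normal output, though that depends on the front end."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: write to the standard output and standard error files directly.\n",
"import sys\n",
"\n",
"sys.stdout.write('written to stdout with write()\\n') # write() does not add a newline itself\n",
"print('written to stdout with print()', file=sys.stdout) # same destination as a plain print()\n",
"print('written to stderr', file=sys.stderr) # error messages usually go here"
]
},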
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Various input and output options\n",
"\n",
"- The *str()* function returns a representation of a value that is **human-readable**\n",
"- The *repr()* function generates a representation that can be read by the **interpreter**\n",
"- Many values, such as numbers or structures like lists and dictionaries, have the same representation using either function. Strings, in particular, have two distinct representations."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s = 'Hello world'\n",
"print (str(s))\n",
"print (repr(s))\n",
"print (str(1.0/7))\n",
"print (repr(1.0/7))\n",
"x = 10*3.25\n",
"y = 200* 200\n",
"s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'\n",
"print ('str(s):', str(s), \"; repr(s):\", repr(s))\n",
"print(s)\n",
"hello = 'hello, world'\n",
"hellos = repr(hello)\n",
"print (hellos)\n",
"repr((x,y,('spam','eggs')))\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for x in range (1,11):\n",
" print(repr(x).rjust(2), repr(x*x).rjust(4), end='')\n",
" print (repr(x*x*x).rjust(6))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## str.format \n",
"\n",
"A pair of braces `{}` in a string acts as a placeholder that *str.format* fills in. An *index* or a name inside the braces selects which argument goes there."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# use {} as a placeholder\n",
"print('I am the {} who say \"{}!\"'.format('bat', 'man'))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# use index to manipulate the ordering. This is what we call the positional argument\n",
"print('I am the {0} who say \"{1}!\"'.format('bat', 'man'))\n",
"print('I am the {1} who say \"{0}!\"'.format('bat', 'man'))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# use of keyword argument\n",
"print ('This {teacher} is {adjective}.'.format\n",
" (teacher='John C.S. Lui', adjective='absolutely horrible'))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Positional and keyword arguments can be arbitrarily combined\n",
"print ('The great music by {0}, {1}, and {other}.'.format\n",
" ('Peter', 'Paul', other='Mary'))\n",
"print ('The great music by {0}, {other}, and {1}.'.format\n",
" ('Peter', 'Paul', other='Mary'))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Format output {x:y}, where x is the positional argument\n",
"# y is the format\n",
"print ('The value of PI is approximately {0:.6f}'.format\n",
" (3.1415926))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Passing an integer after \":\" to make things neat\n",
"\n",
"table ={'John': 1000, 'Peter':500, 'David': 10} # define dictionary\n",
"\n",
"for name, amount in table.items(): # extract item from dictionary\n",
" print ('{0:10} ==> {1:10d}'.format(name, amount))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Mathematics library\n",
"\n",
"The *math* module gives access to the underlying C library functions for floating-point math."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# using pi and cosine\n",
"import math\n",
"print (math.cos(math.pi / 4.0))\n",
"\n",
"print(math.log(1024,2))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The *random* module provides tools for making random selections."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import random\n",
"random.choice(['apple', 'pear', 'banana'])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"random.sample(range(100), 10) # sampling without replacement of 10 items"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import random\n",
"print('a random float number: ', random.random())\n",
"print('a random integer from range(10): ', random.randrange(10))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Library for Internet Access\n",
"\n",
"Here, we have two modules: *urllib* and *smtplib*, for retrieving data from URLs and sending email, respectively. Please read about them in the Python documentation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# read one html file\n",
"\n",
"import urllib.request\n",
"\n",
"response = urllib.request.urlopen('http://python.org/')\n",
"html = response.read()\n",
"print(html)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Library for Performance Measurement\n",
"\n",
"The *timeit* module measures the execution time of small code snippets."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from timeit import Timer\n",
"\n",
"print('time 1 = ', Timer('t=a; a=b; b=t', 'a=1; b=2').timeit())\n",
"print('time 2 = ', Timer('a,b = b,a', 'a=1; b=2').timeit())\n"
]
},
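{
"cell_type": "markdown",
"metadata": {},
"source": [
"`Timer(...).timeit()` runs the statement one million times by default. The following is a minimal sketch of the module-level helpers `timeit.timeit` and `timeit.repeat`, with a smaller `number` chosen arbitrarily here to keep the run short. The measured times depend on your machine."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: time the two swap idioms with fewer repetitions.\n",
"import timeit\n",
"\n",
"# number=100000 is an arbitrary choice for this example\n",
"t1 = timeit.timeit('t=a; a=b; b=t', setup='a=1; b=2', number=100000)\n",
"t2 = timeit.timeit('a,b = b,a', setup='a=1; b=2', number=100000)\n",
"print('temporary variable:', t1, 'seconds')\n",
"print('tuple swap:', t2, 'seconds')\n",
"\n",
"# repeat() returns one total time per run; the minimum is usually the least noisy\n",
"runs = timeit.repeat('a,b = b,a', setup='a=1; b=2', number=100000, repeat=3)\n",
"print('best of 3 runs:', min(runs), 'seconds')"
]
},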
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}