Python Libraries

These notes cover both the standard library and third-party libraries.

1 Built-in exceptions

BaseException
 +-- SystemExit
 +-- KeyboardInterrupt
 +-- GeneratorExit
 +-- Exception
      +-- StopIteration
      +-- StandardError
      |    +-- BufferError
      |    +-- ArithmeticError
      |    |    +-- FloatingPointError
      |    |    +-- OverflowError
      |    |    +-- ZeroDivisionError
      |    +-- AssertionError
      |    +-- AttributeError
      |    +-- EnvironmentError
      |    |    +-- IOError
      |    |    +-- OSError
      |    |         +-- WindowsError (Windows)
      |    |         +-- VMSError (VMS)
      |    +-- EOFError
      |    +-- ImportError
      |    +-- LookupError
      |    |    +-- IndexError
      |    |    +-- KeyError
      |    +-- MemoryError
      |    +-- NameError
      |    |    +-- UnboundLocalError
      |    +-- ReferenceError
      |    +-- RuntimeError
      |    |    +-- NotImplementedError
      |    +-- SyntaxError
      |    |    +-- IndentationError
      |    |         +-- TabError
      |    +-- SystemError
      |    +-- TypeError
      |    +-- ValueError
      |         +-- UnicodeError
      |              +-- UnicodeDecodeError
      |              +-- UnicodeEncodeError
      |              +-- UnicodeTranslateError
      +-- Warning
           +-- DeprecationWarning
           +-- PendingDeprecationWarning
           +-- RuntimeWarning
           +-- SyntaxWarning
           +-- UserWarning
           +-- FutureWarning
           +-- ImportWarning
           +-- UnicodeWarning
           +-- BytesWarning
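
The tree matters in practice because an except clause for a base class also catches its subclasses. A minimal sketch (safe_get is a hypothetical helper, not a built-in); it uses only classes that exist in both Python 2 and Python 3, since StandardError from the tree above is Python 2 only:

```python
def safe_get(container, key, default=None):
    """Return container[key], or default when the lookup fails."""
    try:
        return container[key]
    except LookupError:  # base class of both IndexError and KeyError
        return default

print(safe_get([1, 2, 3], 10))       # IndexError is caught -> None
print(safe_get({"a": 1}, "b", 0))    # KeyError is caught -> 0
print(issubclass(ZeroDivisionError, ArithmeticError))  # True
```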

2 Built-in

These functions are always available.

Numbers:

  • abs(x): absolute value
  • divmod(a,b): a pair (a // b, a % b)
  • max(arg1, arg2, *args)
  • min(arg1, arg2, *args)
  • pow(x,y): x**y (x to the power y)
  • round(x, ndigits=0)
  • sum(iterable)

Conversion

  • int(x)
  • float(x)
  • long(x): Python 2 only; in Python 3, int has arbitrary precision
  • chr(x): integer code point to character
  • ord(c): character to integer code point
  • bool(x): convert x to bool
  • hex(x): convert integer to lowercase hex string prefixed with '0x'
  • oct(x): integer to octal string
  • bin(x): integer to binary string

Boolean:

  • all(iterable): true if all items are true. empty => True
  • any(iterable): true if any item is true. empty => False
  • cmp(x,y): Python 2 only (removed in Python 3)
    • x<y => negative
    • x==y => 0
    • x>y => positive
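
A short sketch of the empty-iterable rules, plus one common way to get cmp-like behaviour on Python 3. The cmp function defined here is a stand-in written for this example, not the removed built-in:

```python
print(all([]))              # True: no item is false
print(any([]))              # False: no item is true
print(all([1, "x", [0]]))   # True: [0] is a non-empty list, hence truthy

def cmp(x, y):
    """Stand-in for Python 2's cmp: negative, zero or positive."""
    return (x > y) - (x < y)

print(cmp(1, 2), cmp(2, 2), cmp(3, 2))  # -1 0 1
```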

Symbol Table

  • locals()
  • globals()
  • dir()

Creation

  • dict
  • list
  • set
  • tuple

Other

  • len(s): length
  • next(iterator)
  • print(*objects, sep=' ', end='\n', file=sys.stdout)
  • range(stop): [0,stop)
  • range(start, stop, step=1)
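  • sorted(iterable, cmp, key, reverse=False): the cmp argument exists only in Python 2; Python 3 accepts only key and reverse, as keyword arguments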
    • key=lambda x: x[1]
  • type(obj): get the type of obj
  • open(name, mode): return a file object.
    • mode characters: r (read), w (write), a (append), b (binary); add + to open for both reading and writing
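
A few of these in action, as a small sketch with made-up sample data:

```python
pairs = [("b", 2), ("a", 3), ("c", 1)]

# Sort by the second element of each tuple.
by_second = sorted(pairs, key=lambda x: x[1])
by_second_desc = sorted(pairs, key=lambda x: x[1], reverse=True)
print(by_second)        # [('c', 1), ('b', 2), ('a', 3)]
print(by_second_desc)   # [('a', 3), ('b', 2), ('c', 1)]

stepped = list(range(2, 10, 3))   # start, stop (exclusive), step
print(stepped)          # [2, 5, 8]

it = iter(range(3))
first, second = next(it), next(it)
print(first, second, len(pairs), type(pairs).__name__)  # 0 1 3 list
```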

3 Printing

  • pprint.pprint(object, stream=None): pretty print
  • 'string {0}, {hello}'.format('yes', hello=2)
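
Both bullets as a runnable sketch. The dictionary is sample data; pprint is pointed at an in-memory stream only so the output can be inspected:

```python
import io
import pprint

text = 'string {0}, {hello}'.format('yes', hello=2)
print(text)  # string yes, 2

buf = io.StringIO()
# pprint sorts dictionary keys by default.
pprint.pprint({"b": [1, 2], "a": {"x": 1}}, stream=buf)
print(buf.getvalue())  # {'a': {'x': 1}, 'b': [1, 2]}
```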

4 File System

4.1 os.path

Functions listed without parameters take a single path argument.

  • exists: GOOD. check whether a path exists
  • split: return a pair (head, tail). tail is the last component, without slash. If path ends with slash, tail is empty
    • basename: the tail of the split output
    • dirname: head of split output
  • normpath: collapse redundant separators and up level references
  • abspath: from relative to absolute path. normpath(join(os.getcwd(), path))
  • commonprefix(list): return the longest common prefix, compared character by character, so the result may not be a valid path
  • expanduser: replace an initial ~ component with the user's home directory.
  • getsize: in bytes
  • isabs: whether the path is absolute
  • isfile
  • isdir
  • islink
  • join(path, *paths): join intelligently
  • realpath: canonical path by following symbolic links
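
A sketch of the pure string-manipulation functions. It imports posixpath (the POSIX implementation behind os.path) so the results do not depend on the host OS; on Linux or macOS, os.path gives the same answers. The paths are made up:

```python
import posixpath

p = "/home/user/docs/report.txt"
print(posixpath.split(p))                    # ('/home/user/docs', 'report.txt')
print(posixpath.split("/home/user/docs/"))   # ('/home/user/docs', '')
print(posixpath.basename(p), posixpath.dirname(p))
print(posixpath.normpath("/a//b/../c/./d"))  # /a/c/d
print(posixpath.join("/a", "b", "c.txt"))    # /a/b/c.txt
print(posixpath.join("/a", "/b"))            # /b (an absolute part restarts the path)
print(posixpath.commonprefix(["/usr/lib", "/usr/local"]))  # /usr/l
print(posixpath.isabs(p))                    # True
```

Note the commonprefix result: it is a string prefix, not a directory.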

4.2 TODO pathlib

4.3 TODO tempfile

5 os

5.1 Env

  • os.environ['HOME']
  • os.getenv(name)
  • os.putenv(name, value)
  • os.unsetenv(name)
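
A small sketch of reading and writing the environment through os.environ, which is usually the simplest route because os.getenv reads from it. DEMO_VAR and DEMO_MISSING are hypothetical variable names:

```python
import os

os.environ["DEMO_VAR"] = "1"            # set (also visible to child processes)
value = os.getenv("DEMO_VAR")           # '1'
fallback = os.getenv("DEMO_MISSING", "fallback")  # default when unset

del os.environ["DEMO_VAR"]              # unset
after_delete = os.getenv("DEMO_VAR")    # None
print(value, fallback, after_delete)
```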

5.2 Filesystem

  • os.getcwd(): current working directory
  • os.chdir(path): change cwd
  • os.mkdir(path)
  • os.listdir(path='.'): list all in this dir. E.g. for item in os.listdir('/path'): print (item)
  • os.makedirs(path): GOOD. This is the way to make directories; it also creates any missing intermediate directories
  • os.remove(path): remove a file
  • os.rmdir(path): remove a directory; only works if it is empty
  • os.removedirs(path): foo/bar/aaa will try to remove aaa, then bar, then foo. Don't use! To recursively remove all contents, use shutil.rmtree
  • os.rename(src, dst)
  • os.renames(old, new)
  • os.tempnam(): a reasonable absolute name for creating a temporary file (Python 2 only)
    • vulnerable to symlink attacks; prefer the tempfile module
  • os.walk(top, topdown=True): for each directory including top itself, it yields 3-tuple (dirpath, dirnames, filenames). E.g. for root,dirs,files in os.walk('/path'): for f in files: print (f);
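
A sketch that exercises makedirs, listdir, walk, remove and rmdir inside a throwaway directory. The directory comes from tempfile.mkdtemp and the file names are made up:

```python
import os
import tempfile

root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a", "b"))        # creates a/ and a/b/
file_path = os.path.join(root, "a", "b", "f.txt")
with open(file_path, "w") as fh:
    fh.write("hi")

found = []
for dirpath, dirnames, filenames in os.walk(root):
    for name in filenames:
        # Record each file relative to the top directory.
        found.append(os.path.relpath(os.path.join(dirpath, name), root))
top_level = sorted(os.listdir(root))
print(found, top_level)

# Clean up from the bottom: rmdir only removes empty directories.
os.remove(file_path)
os.rmdir(os.path.join(root, "a", "b"))
os.rmdir(os.path.join(root, "a"))
os.rmdir(root)
```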

5.3 shutil

  • copy(src,dst)
  • copytree(src, dst): recursive
  • rmtree(path): rm -r
  • move(src, dst)
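
The four functions together, sketched in a temporary directory (all names are made up for the example):

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()
src_dir = os.path.join(root, "src")
os.makedirs(src_dir)
src_file = os.path.join(src_dir, "a.txt")
with open(src_file, "w") as fh:
    fh.write("data")

shutil.copy(src_file, os.path.join(root, "copy.txt"))      # copy one file
shutil.copytree(src_dir, os.path.join(root, "tree"))        # dst must not exist yet
shutil.move(os.path.join(root, "copy.txt"), os.path.join(root, "moved.txt"))

listing = sorted(os.listdir(root))
tree_listing = os.listdir(os.path.join(root, "tree"))
print(listing, tree_listing)

shutil.rmtree(root)   # like rm -r: removes the directory and everything in it
```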

The os.popen family is deprecated (os.popen2, os.popen3 and os.popen4 were removed in Python 3). Use subprocess instead.

5.4 Process

  • os.abort()
  • os.execl(path, arg0, arg1, …)
  • os.execle(path, arg0, arg1, …, env)
  • os.execlp(file, arg0, arg1, …)
  • os.execlpe(file, arg0, arg1, …, env)
  • os.execv(path, args)
  • os.execve(path, args, env)
  • os.execvp(file, args)
  • os.execvpe(file, args, env)
  • os.fork()
  • os.wait()
  • os.system(cmd): run cmd, return exit code
  • os.times(): 5-tuple
    • user time
    • system time
    • children's user time
    • children's system time
    • elapsed real time

6 io

  • f = open('file.txt')
  • f = io.StringIO("some string"): in memory text stream
  • f = open('file', 'rb')
  • f = io.BytesIO(b"some binary data \x00\x01")
  • support with statement: with open('file.txt') as file:

6.1 IOBase

Methods:

  • close()
  • flush()
  • readline(): return one line
  • readlines(): return a list of lines
  • seek(offset, whence=0): move to offset, interpreted relative to whence
    • 0 start
    • 1 current
    • 2 end
  • tell(): current position
  • writelines(lines): write a list of lines
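
A sketch of these methods on an in-memory binary stream (io.BytesIO), so no file on disk is needed:

```python
import io

f = io.BytesIO(b"line one\nline two\n")
first = f.readline()     # b'line one\n'
pos = f.tell()           # 9: position just after the first line

f.seek(0)                # whence defaults to 0: offset from the start
lines = f.readlines()    # both lines as a list

f.seek(-4, 2)            # whence=2: offset relative to the end
tail = f.read()          # b'two\n'
f.close()
print(first, pos, lines, tail)
```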

6.2 RawIOBase : IOBase (should not use directly)

  • read()
  • readall()
  • readinto(b)
  • write(b)

6.3 BufferedIOBase

  • read(): read all
  • write(b)

6.4 FileIO : RawIOBase

6.5 BytesIO : BufferedIOBase

6.6 BufferedReader(raw)

  • peek()
  • read()

6.7 BufferedWriter(raw)

  • flush()
  • write()

6.8 TextIOBase : IOBase

  • read()
  • readline(size=-1)
  • seek(offset, whence=0)
  • tell()
  • write(s): finally the string!

6.9 TextIOWrapper(buffer) : TextIOBase

6.10 StringIO

  • getvalue()

7 time

  • time.sleep(secs)
  • time.time(): time in seconds since epoch
  • strptime(string[, format]): parse a string into a struct_time
    • format default: "%a %b %d %H:%M:%S %Y"
    • time.strptime("30 Nov 00", "%d %b %y")
  • strftime(format[, t]): convert a struct_time to a string
    • %a/A: abbr/full weekday name
    • %b/B: abbr/full month name
    • %Y: year
    • %m: month [01,12]
    • %d: day of the month [01,31]
    • %H: 24-hour [00,23]
    • %I: 12-hour [01,12]
    • %p: AM or PM
    • %M: Minute [00,59]
    • %S: second [00,61]
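  • gmtime([secs]): convert seconds since the epoch to a struct_time in UTC; defaults to the current time
  • localtime([secs]): like gmtime(), but converts to local time
  • clock(): processor time as a floating-point number in seconds (deprecated since Python 3.3 and removed in 3.8; use perf_counter() or process_time())

class time.struct_time: returned by gmtime(), localtime() and strptime()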
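
A round trip through strptime and strftime, as a sketch. It assumes the default C locale, since %b month names are locale-dependent:

```python
import time

t = time.strptime("30 Nov 00", "%d %b %y")    # returns a struct_time
print(t.tm_year, t.tm_mon, t.tm_mday)         # 2000 11 30

stamp = time.strftime("%Y-%m-%d %H:%M:%S", t)
print(stamp)                                  # 2000-11-30 00:00:00

epoch_day = time.strftime("%Y-%m-%d", time.gmtime(0))
print(epoch_day)                              # 1970-01-01
```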

8 argparse

  import argparse
  parser = argparse.ArgumentParser(description='Description here')

  parser.add_argument(...)
  parser.add_argument(...)

  args = parser.parse_args()

add_argument(name[, action][, nargs])

  • name: either of the following. In both cases the value is available as args.foo
    • a positional name, e.g. foo
    • a series of option flags, e.g. '-f', '--foo'
  • action
    • store: store what is supplied by the command line
    • store_const: no need to supply a value in command line, will store the const argument (see below)
    • store_true, store_false: special cases of store_const
    • append: append multiple options into a list
  • const: holds a value that comes from the program, not from the command line. Use with store_const
  • default: the value used when the argument is omitted from the command line
  • type: the default type is str; can be int, float, etc. (the callable itself, without quotes)
  • choices: a list of possible values to choose from
  • required: True or False, indicates whether the option must be given
  • help: the help message to be displayed
  • dest: usually derived automatically, from the positional name or from the first long option such as --foo, so there is no need to set it manually. The name is added as an attribute of the returned args object.
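
A sketch that combines several of these options. The argument names are made up, and parse_args is given an explicit list so the example does not depend on the real command line:

```python
import argparse

parser = argparse.ArgumentParser(description="Demo parser")
parser.add_argument("path")                                   # positional -> args.path
parser.add_argument("-n", "--count", type=int, default=1)     # dest is 'count'
parser.add_argument("-v", "--verbose", action="store_true")   # flag, no value
parser.add_argument("--tag", action="append", default=[])     # repeatable
parser.add_argument("--mode", choices=["fast", "slow"], default="fast")

args = parser.parse_args(["in.txt", "-n", "3", "--tag", "x", "--tag", "y", "-v"])
print(args.path, args.count, args.verbose, args.tag, args.mode)
# in.txt 3 True ['x', 'y'] fast
```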

9 Concurrent

9.1 threading

The module is threading; the class is Thread.

Functions

  • threading.active_count(): number of Thread objects currently alive
  • threading.current_thread(): current Thread object
  • threading.enumerate(): return a list of all Thread objects currently alive
  • threading.main_thread(): the main Thread object
  • threading.local(): thread-local storage; attribute values are different for each thread. Typical usage: mydata = threading.local()

Two ways to specify what to run:

  • pass a callable object to the target argument when constructing Thread
  • define a subclass of Thread and override the run method.

Methods:

  • start: start the thread. It calls the run method in a separate thread. The thread terminates when run returns
  • join(timeout=None): the calling thread blocks until this thread terminates
    • timeout should be a float in seconds
  • is_alive: test whether the thread is still running
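
Both ways of specifying the work, as a minimal sketch. The worker function and the Doubler class are made up for the example:

```python
import threading

# Way 1: pass a callable as target.
results = []

def worker(n):
    results.append(n * n)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()          # block until each thread has finished
print(sorted(results))  # [0, 1, 4, 9]

# Way 2: subclass Thread and override run.
class Doubler(threading.Thread):
    def __init__(self, value):
        super().__init__()
        self.value = value
        self.result = None

    def run(self):
        self.result = self.value * 2

d = Doubler(21)
d.start()
d.join()
print(d.result, d.is_alive())  # 42 False
```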

9.2 Thread Sync

class threading.Lock

  • acquire()
  • release()

class threading.RLock

  • this is a recursive (reentrant) lock. The same thread can acquire the lock multiple times. The acquisitions nest, and only after the last matching release can another thread acquire the lock
  • acquire()
  • release()

class threading.Condition(lock=None)

  • the lock must be a Lock or RLock. If None, an RLock is created
  • acquire()
  • release()
  • wait(timeout=None): wait until notified
    • release underlying lock
    • block until notify
    • re-acquire the lock and return
    • typical usage: while not item_is_available(): cv.wait()
    • often used with the with statement: with cv: cv.wait_for(pred); get()
  • wait_for(predicate, timeout=None)
    • this is the same as while not predicate(): cv.wait(), thus more convenient than wait
  • notify(n=1): wake up at most n waiting threads (one by default)
  • notify_all(): notify all threads waiting on this condition
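
A one-item producer/consumer sketch using wait_for and notify. The names items, received, producer and consumer are made up; the predicate makes the example correct whichever side runs first:

```python
import threading

cv = threading.Condition()
items = []
received = []

def consumer():
    with cv:                                   # acquires the underlying lock
        cv.wait_for(lambda: len(items) > 0)    # releases the lock while waiting
        received.append(items.pop())

def producer():
    with cv:
        items.append("job")
        cv.notify()                            # wake one waiting thread

c = threading.Thread(target=consumer)
c.start()
producer()
c.join()
print(received)  # ['job']
```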

class threading.Semaphore: this class manages resources with limited capacity.

  • acquire(): decrease capacity
  • release(): increase capacity

class threading.Event

  • is_set(): return True if the flag is set
  • set(): set flag to true
  • clear(): set flag to false
  • wait(timeout=None): block until internal flag is true

class threading.Timer(interval, function) : Thread

  • interval is a float in seconds, function is a callable. Use the start method to start the thread; the function is called after the delay.
  • cancel(): stop the timer and cancel the execution. Only works if the timer is still waiting.
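
Event and Timer fit together naturally. In this sketch (delays chosen arbitrarily), one timer sets an event and a second timer is cancelled before it fires:

```python
import threading

done = threading.Event()

timer = threading.Timer(0.05, done.set)   # call done.set() after 50 ms
timer.start()
fired = done.wait(timeout=2.0)            # True once the flag is set
timer.join()

late = threading.Timer(60, done.clear)    # would clear the flag in a minute
late.start()
late.cancel()                             # still waiting, so this stops it
late.join()

print(fired, done.is_set())  # True True
```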

class threading.Barrier(parties, action=None, timeout=None)

  • parties is an integer. Every thread calling wait blocks until parties threads have called it; then all of them are released simultaneously.
  • wait(timeout=None)
  • reset(): reset the barrier. Threads waiting on it receive BrokenBarrierError
  • abort(): all current and future wait calls on it get BrokenBarrierError
  • parties: number of parties
  • n_waiting: number of threads currently waiting
  • broken: True or False
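
A sketch with three parties. The party function is made up; wait returns a distinct integer in range(parties) to each thread:

```python
import threading

barrier = threading.Barrier(3)
arrivals = []

def party():
    index = barrier.wait(timeout=5)   # blocks until 3 threads have called wait
    arrivals.append(index)

threads = [threading.Thread(target=party) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(arrivals), barrier.parties, barrier.broken)  # [0, 1, 2] 3 False
```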

9.2.1 Using with statement

Lock, RLock, Condition, Semaphore can be used.

with somelock:
  # do something

is equivalent to:

somelock.acquire()
try:
  # do something
finally:
  somelock.release()
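
A sketch of the with form guarding a shared counter. The state dictionary and add_many are made up for the example:

```python
import threading

state = {"counter": 0}
lock = threading.Lock()

def add_many(n):
    for _ in range(n):
        with lock:                 # acquire() on entry, release() on exit
            state["counter"] += 1

threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(state["counter"])  # 40000
```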

9.3 multiprocessing

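This provides the multiprocessing.Process class, whose API is similar to Thread. Each Process runs in a separate interpreter process (on Unix it can be started with fork, without an explicit exec), so memory is not shared with the parent. Unlike threads, separate processes are not constrained by the global interpreter lock, so they can run CPU-bound Python code in parallel.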

9.4 Process (subprocess module)

  • subprocess.run(args, *, stdin=None, input=None, stdout=None, stderr=None, shell=False, timeout=None, check=False)
    • run the command and wait for it to complete. Return a CompletedProcess instance.
    • if check is True, raise CalledProcessError if the return code is non-zero. This replaces check_call and check_output.

class subprocess.CompletedProcess

  • args
  • returncode
  • stdout: captured if PIPE is passed to stdout
  • stderr: captured if PIPE is passed to stderr
  • check_returncode(): if returncode is non-zero, raise CalledProcessError

Variables:

  • subprocess.DEVNULL
  • subprocess.PIPE
  • subprocess.STDOUT: this is only used in the place of stderr to redirect it to stdout

class subprocess.CalledProcessError

  • returncode
  • cmd
  • output: same as stdout
  • stdout
  • stderr
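
A sketch of run with captured output and with check=True. It launches the current Python interpreter (sys.executable) as the child so it does not depend on any external command:

```python
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
print(result.returncode, result.stdout.strip())   # 0 b'hello'

failed_code = None
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"], check=True)
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode                  # the child's exit status
print(failed_code)  # 3
```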

The following functions are from 2.7; in new code, just use run.

  • subprocess.call(args, *, stdin=None, stdout=None, stderr=None, shell=False)
    • args: a list of arguments, including arg0
    • args can also be a single string (normally together with shell=True); the bare * in the signature only marks the following parameters as keyword-only
    • it will wait, then return returncode
    • do not use stdout=PIPE or stderr=PIPE with call (the child can block when the pipe fills up); use Popen with communicate() instead. TODO
    • shell=True is risky (shell injection), but it gives access to
      • shell pipes
      • filename wildcard
      • env variable expansion
      • ~ expansion
  • check_call(args, *, …): same as call, except it raises CalledProcessError if the return code is non-zero
  • check_output(args, *, stdin=None, stderr=None, shell=False, universal_newlines=False)
    • if the return code is non-zero, raise CalledProcessError. Otherwise return the stdout

Popen object

  • Popen constructor
    • args, bufsize=0, executable=None,
    • stdin=None, stdout=None, stderr=None,
    • preexec_fn=None, close_fds=False,
    • shell=False, cwd=None, env=None,
    • universal_newlines=False, startupinfo=None, creationflags=0
  • Popen.poll(): check if child process has terminated. Set and return returncode.
  • Popen.wait(): wait for the process to terminate. Don't use PIPE with this: the child can block on a full pipe and deadlock; use communicate() instead.
  • Popen.communicate(input=None): to use this, the corresponding stdin, stdout, stderr should be set to PIPE.
    • send data to stdin (string)
    • read data from stdout and stderr (it returns a tuple (out, err))
    • wait for termination
  • Popen.send_signal(signal)
  • Popen.terminate(): send SIGTERM
  • Popen.kill(): send SIGKILL
  • Popen.pid
  • Popen.returncode
    • set by poll and wait (and indirectly by communicate)
    • None indicates the process has not terminated yet
    • -N means terminated by signal N
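
A sketch of Popen with communicate. The child is the current Python interpreter running a one-line script that upper-cases its stdin; all three streams are pipes, as communicate requires:

```python
import subprocess
import sys

child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"
proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
# Sends input, reads both outputs to EOF, then waits for termination.
out, err = proc.communicate(input=b"hello")
print(out, proc.returncode)  # b'HELLO' 0
```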

10 Internet

10.1 urllib.request

package urllib.request

Functions

  • urlopen(url, data=None)
    • url can be a string or Request object
    • for http and https, returns a http.client.HTTPResponse object
    • for FTP, file, data urls, return a urllib.response.addinfourl object
  • pathname2url(path): do quoting
  • url2pathname(path): do unquoting

class Request

  • constructor: (url, data=None, headers={}, method=None)
    • url: a string
    • headers: a dictionary.
    • method: a string. Defaults to 'GET' (or 'POST' when data is given). Other values include 'HEAD'

methods:

  • get_method()
  • add_header(key, val)
  • has_header(key)
  • get_header(key)
  • remove_header(key)
  • get_full_url()
  • header_items(): return a list of tuples (key, value)
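
  import json
  from urllib import request

  # query (a URL string) and token are assumed to be defined elsewhere
  req = request.Request(query)
  req.add_header("Authorization", "token " + token)
  response = request.urlopen(req)
  s = response.read().decode('utf8')
  langj = json.loads(s)

  # legacy interface; the docs warn it might become deprecated
  urllib.request.urlretrieve(url[, filename])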
10.2 urllib.parse

  • quote(string)
  • quote_plus(string)
  • unquote(string)
  • unquote_plus(string)
  • urlencode(query)
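
How the quoting functions differ, sketched on made-up strings. quote leaves '/' alone by default, while quote_plus also turns spaces into '+':

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus, urlencode

print(quote("a b/c?d"))          # a%20b/c%3Fd
print(quote_plus("a b/c"))       # a+b%2Fc
print(unquote("a%20b"))          # a b
print(unquote_plus("a+b"))       # a b
print(urlencode({"q": "python io", "page": 2}))  # q=python+io&page=2
```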

11 Data

11.1 JSON

import json
json.dumps({"C": 0, "D": 1})
json.loads("a string of json")

json.dump(obj, fp, indent=2)
json.load(fp)
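
A round-trip sketch. dumps/loads work on strings and dump/load on file objects; an io.StringIO stands in for a real file here:

```python
import io
import json

text = json.dumps({"C": 0, "D": 1})
print(text)                      # {"C": 0, "D": 1}
data = json.loads(text)          # back to a dict

fp = io.StringIO()               # stand-in for an open text file
json.dump({"C": 0}, fp, indent=2)
pretty = fp.getvalue()
print(pretty)

fp.seek(0)
loaded = json.load(fp)
print(data, loaded)
```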