Python Libraries
These notes cover both standard-library and third-party libraries.
1 Built-in exceptions
BaseException
 +-- SystemExit
 +-- KeyboardInterrupt
 +-- GeneratorExit
 +-- Exception
      +-- StopIteration
      +-- StandardError
      |    +-- BufferError
      |    +-- ArithmeticError
      |    |    +-- FloatingPointError
      |    |    +-- OverflowError
      |    |    +-- ZeroDivisionError
      |    +-- AssertionError
      |    +-- AttributeError
      |    +-- EnvironmentError
      |    |    +-- IOError
      |    |    +-- OSError
      |    |         +-- WindowsError (Windows)
      |    |         +-- VMSError (VMS)
      |    +-- EOFError
      |    +-- ImportError
      |    +-- LookupError
      |    |    +-- IndexError
      |    |    +-- KeyError
      |    +-- MemoryError
      |    +-- NameError
      |    |    +-- UnboundLocalError
      |    +-- ReferenceError
      |    +-- RuntimeError
      |    |    +-- NotImplementedError
      |    +-- SyntaxError
      |    |    +-- IndentationError
      |    |         +-- TabError
      |    +-- SystemError
      |    +-- TypeError
      |    +-- ValueError
      |         +-- UnicodeError
      |              +-- UnicodeDecodeError
      |              +-- UnicodeEncodeError
      |              +-- UnicodeTranslateError
      +-- Warning
           +-- DeprecationWarning
           +-- PendingDeprecationWarning
           +-- RuntimeWarning
           +-- SyntaxWarning
           +-- UserWarning
           +-- FutureWarning
           +-- ImportWarning
           +-- UnicodeWarning
           +-- BytesWarning
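A minimal sketch of what the hierarchy means in practice: an except clause for a parent class also catches its subclasses. The tree above is the Python 2 layout; this sketch assumes Python 3, where StandardError no longer exists but LookupError and ArithmeticError keep the same children. The helper name classify is hypothetical.

```python
def classify(action):
    """Run a zero-argument callable and report which base class caught its error."""
    try:
        action()
    except LookupError:        # parent of IndexError and KeyError
        return "lookup"
    except ArithmeticError:    # parent of ZeroDivisionError, OverflowError, ...
        return "arithmetic"
    except Exception:          # any other non-system-exiting exception
        return "other"
    return "ok"


results = [
    classify(lambda: [][0]),       # IndexError
    classify(lambda: {}["k"]),     # KeyError
    classify(lambda: 1 / 0),       # ZeroDivisionError
    classify(lambda: int("x")),    # ValueError
    classify(lambda: None),        # no exception
]
print(results)
```

Ordering matters: the more specific clauses have to come before except Exception, otherwise they are never reached.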
2 Built-in
These functions are always available.
Numbers:
- abs(x): absolute value
- divmod(a,b): a pair (a // b, a % b)
- max(arg1, arg2, *args)
- min(arg1, arg2, *args)
- pow(x,y): x**y (x to the power y)
- round(x, ndigits=0)
- sum(iterable)
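Conversion
- int(x)
- float(x)
- long(x): Python 2 only; in Python 3, int has unlimited precision
- chr(x): integer code to character (ASCII in Python 2, Unicode code point in Python 3)
- ord(c): character to integer code
- bool(x): convert x to bool
- hex(x): convert an integer to a lowercase hex string prefixed with '0x'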
- oct(x): integer to octal string
- bin(x): an integer to binary string
Boolean:
- all(iterable): true if all items are true. empty => True
- any(iterable): true if any item is true. empty => False
- cmp(x,y): Python 2 only
  - x < y => negative
  - x == y => 0
  - x > y => positive
Symbol Table
- locals()
- globals()
- dir()
Creation
- dict
- list
- set
- tuple
Other
- len(s): length
- next(iterator)
- print(*objects, sep=' ', end='\n', file=sys.stdout)
- range(stop): [0,stop)
- range(start, stop, step=1)
- sorted(iterable, cmp, key, reverse=False): cmp is Python 2 only; Python 3 accepts only key and reverse
  - e.g. key=lambda x: x[1] sorts by the second element
- type(obj): get the type of obj
- open(name, mode): return a file object.
  - mode: r (read), w (write), a (append), b (binary); add + for both reading and writing
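A short sketch of a few of the built-ins listed above, assuming Python 3 (so oct() yields the '0o' prefix and range() is lazy). The sample data is made up for illustration.

```python
pairs = [("b", 3), ("a", 1), ("c", 2)]

# sorted() with a key function: order by the second element of each pair
by_second = sorted(pairs, key=lambda x: x[1])
by_second_desc = sorted(pairs, key=lambda x: x[1], reverse=True)

# divmod() returns (a // b, a % b) in one call
quotient, remainder = divmod(17, 5)

# all() of an empty iterable is True, any() of an empty iterable is False
empty_all, empty_any = all([]), any([])

# integer-to-string conversions
as_hex, as_oct, as_bin = hex(255), oct(8), bin(5)

# range(stop) and range(start, stop, step) are half-open intervals
first_three = list(range(3))
stepped = list(range(1, 10, 3))

print(by_second, quotient, remainder, as_hex, as_oct, as_bin, stepped)
```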
3 Printing
- pprint.pprint(object, stream=None): pretty print
- 'string {0}, {hello}'.format('yes', hello=2)
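A small sketch of the two printing helpers above. It uses pprint.pformat, the string-returning sibling of pprint.pprint, so the result can be inspected; the nested dictionary is made-up sample data.

```python
import pprint

# Positional ({0}) and keyword ({hello}) fields in str.format
msg = 'string {0}, {hello}'.format('yes', hello=2)

# pformat returns the text that pprint would print
nested = {"name": "x", "items": list(range(3))}
text = pprint.pformat(nested)

print(msg)
pprint.pprint(nested)
```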
4 File System
4.1 os.path
If no parameter is listed, the function takes a single path.
- exists: GOOD. check whether a path exists
- split: return a pair (head, tail). tail is the last component, without slash. If path ends with slash, tail is empty
- basename: the tail of the split output
- dirname: the head of the split output
- normpath: collapse redundant separators and up-level references
- abspath: from relative to absolute path. normpath(join(os.getcwd(), path))
- commonprefix(list): return the longest common prefix, compared character by character
- expanduser: replace an initial ~ component with the user's home directory
- getsize: size in bytes
- isabs: predicate for absolute path
- isfile: predicate for regular file
- isdir: predicate for directory
- islink: predicate for symbolic link
- join(path, *paths): join intelligently
- realpath: canonical path by following symbolic links
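A minimal sketch of the pure string-manipulation functions above. It assumes a POSIX system (forward-slash separators); the paths are made up and do not need to exist, since none of these calls touch the filesystem.

```python
import os.path

p = "/usr/local/lib/../bin/tool.py"

head, tail = os.path.split(p)                 # tail is the last component
trailing = os.path.split("/usr/local/")       # path ends with slash: empty tail
norm = os.path.normpath(p)                    # collapses "lib/.."
joined = os.path.join("/usr", "local", "bin")
reset = os.path.join("/usr", "/etc")          # an absolute component discards what came before
prefix = os.path.commonprefix(["/usr/lib", "/usr/local"])  # character-wise, not component-wise

print(head, tail, norm, joined, reset, prefix)
```

Note that commonprefix can return something that is not a valid directory, which is a consequence of the character-by-character comparison.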
4.2 TODO pathlib
4.3 TODO tempfile
5 os
5.1 Env
- os.environ['HOME']
- os.getenv(name)
- os.putenv(name, value)
- os.unsetenv(name)
5.2 Filesystem
- os.getcwd(): current working directory
- os.chdir(path): change cwd
- os.mkdir(path)
- os.listdir(path='.'): list all entries in this dir. E.g. for item in os.listdir('/path'): print(item)
- os.makedirs(path): GOOD. This is the way to go to make directories
- os.remove(path): remove a file
- os.rmdir(path): remove an empty dir.
- os.removedirs(path): for foo/bar/aaa, it will try to remove aaa, then bar, then foo. Don't use! To recursively remove all contents, use shutil.rmtree
- os.rename(src, dst)
- os.renames(old, new)
- os.rmdir(path): only works if the dir is empty
- os.tempnam(): a reasonable absolute name for creating a temporary file
  - vulnerable to symlink attacks and removed in Python 3; use the tempfile module instead
- os.walk(top, topdown=True): for each directory including top itself, it yields a 3-tuple (dirpath, dirnames, filenames). E.g.
  for root, dirs, files in os.walk('/path'):
      for f in files:
          print(f)
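A self-contained sketch of makedirs, listdir and walk working together. It runs inside a throwaway directory created with tempfile.TemporaryDirectory (an assumption of this sketch, chosen so nothing is left behind); the file names are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    # makedirs creates the intermediate directory "a" as well as "a/b"
    os.makedirs(os.path.join(top, "a", "b"))
    for parts in (("a", "one.txt"), ("a", "b", "two.txt")):
        with open(os.path.join(top, *parts), "w") as fh:
            fh.write("x")

    # walk yields (dirpath, dirnames, filenames) for top and every subdirectory
    found = []
    for root, dirs, files in os.walk(top):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), top)
            found.append(rel.replace(os.sep, "/"))
    found.sort()

    # listdir only looks at one level
    top_entries = sorted(os.listdir(top))

print(found, top_entries)
```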
5.3 shutil
- copy(src,dst)
- copytree(src, dst): recursive
- rmtree(path): rm -r
- move(src, dst)
The os.popen family is deprecated. Use subprocess instead.
5.4 Process
- os.abort()
- os.execl(path, arg0, arg1, …)
- os.execle(path, arg0, arg1, …, env)
- os.execlp(file, arg0, arg1, …)
- os.execlpe(file, arg0, arg1, …, env)
- os.execv(path, args)
- os.execve(path, args, env)
- os.execvp(file, args)
- os.execvpe(file, args, env)
- os.fork()
- os.wait()
- os.system(cmd): run cmd, return exit code
- os.times(): 5-tuple
  - user time
  - system time
  - children's user time
  - children's system time
  - elapsed real time
6 io
- f = open('file.txt')
- f = io.StringIO("some string"): in memory text stream
- f = open('file', 'rb')
- f = io.BytesIO(b"some binary data \x00\x01")
- supports the with statement (the file is closed on exit):
  with open('file.txt') as file:
      data = file.read()
6.1 IOBase
Methods:
- close()
- flush()
- readline(): return one line
- readlines(): return a list of lines
- seek(offset, whence=0): whence selects the reference point
  - 0 start
  - 1 current
  - 2 end
- tell(): current position
- writelines(lines): write a list of lines
6.2 RawIOBase : IOBase (should not use directly)
- read()
- readall()
- readinto(b)
- write(b)
6.3 BufferedIOBase
- read(): read all
- write(b)
6.4 FileIO : RawIOBase
6.5 BytesIO : BufferedIOBase
6.6 BufferedReader(raw)
- peek()
- read()
6.7 BufferedWriter(raw)
- flush()
- write()
6.8 TextIOBase : IOBase
- read()
- readline(size=-1)
- seek(offset, whence=0)
- tell()
- write(s): finally the string!
6.9 TextIOWrapper(buffer) : TextIOBase
6.10 StringIO
- getvalue()
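A brief sketch of the in-memory streams, which expose the same methods as real files and so are handy for trying out the API above. All data here is made up.

```python
import io

# Text stream: readline, tell, seek, readlines
text = io.StringIO("first\nsecond\n")
line = text.readline()
pos = text.tell()           # position after the first line
text.seek(0)                # whence defaults to 0 (start)
lines = text.readlines()

# Writing, then retrieving everything with getvalue()
out = io.StringIO()
out.write("hello ")
out.writelines(["a\n", "b\n"])
value = out.getvalue()

# Binary stream: seek relative to the end (whence=2)
raw = io.BytesIO(b"some binary data \x00\x01")
raw.seek(-2, 2)
tail = raw.read()

print(repr(line), pos, lines, repr(value), tail)
```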
7 time
- time.sleep(secs)
- time.time(): time in seconds since epoch
- strptime(string[, format]): parse a string into time object
- format default: "%a %b %d %H:%M:%S %Y"
- time.strptime("30 Nov 00", "%d %b %y")
- strftime(format[, t]): convert from time object to string
- %a/A: abbr/full weekday name
- %b/B: abbr/full month name
- %Y: year
- %m: month [01,12]
- %d: day of the month [01,31]
- %H: 24-hour [00,23]
- %I: 12-hour [01,12]
- %p: AM or PM
- %M: Minute [00,59]
- %S: second [00,61]
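- gmtime([secs]): convert seconds since the epoch to a struct_time in UTC; uses time.time() if secs is omitted
- localtime([secs]): like gmtime(), but in local time
- clock(): processor time as a floating-point number in seconds (removed in Python 3.8; use perf_counter() or process_time())
class time.struct_time: returned by gmtime(), localtime() and strptime()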
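A minimal round-trip sketch: parse a string into a struct_time, then format it back. It assumes an English/C locale, because %b (abbreviated month name) is locale-dependent.

```python
import time

# Parse using the same format string as in the notes above
t = time.strptime("30 Nov 00", "%d %b %y")

# Format the struct_time back into a string
iso = time.strftime("%Y-%m-%d %H:%M:%S", t)

# gmtime(0) is the epoch itself, expressed in UTC
epoch = time.gmtime(0)

print(t.tm_year, t.tm_mon, t.tm_mday, iso, epoch.tm_year)
```

Fields that the format does not mention (hour, minute, second here) default to zero in the parsed result.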
8 argparse
import argparse
parser = argparse.ArgumentParser(description='Description here')
parser.add_argument(...)
parser.add_argument(...)
args = parser.parse_args()
add_argument(name[, action][, nargs])
- name: can be either of the following. In either case, the name of the variable will be args.foo
  - a name, e.g. foo
  - a list of option strings, e.g. ('-f', '--foo')
- action
  - store: store what is supplied on the command line
  - store_const: no value is supplied on the command line; stores the const argument (see below)
  - store_true, store_false: special cases of store_const
  - append: append multiple occurrences of the option into a list
- const: holds a value that comes from the program, not from the command line. Use with store_const
- default: the option can be omitted from the command line, and takes this default value
- type: the default type is string; can be int, float, etc., without quotes
- choices: a list of possible values to choose from
- required: True or False, indicates whether the option must be present
- help: the help message to be displayed
- dest: the dest can be automatic. It can be the name of the first argument, or the name of the --foo option, so there is no need to set it manually. The name is added as an attribute of the returned args object.
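A sketch that exercises the options described above in one parser. The argument names (input, --count, --mode, and so on) are made up for illustration, and parse_args is given an explicit list so the example does not depend on the real command line.

```python
import argparse

parser = argparse.ArgumentParser(description="Demo of common add_argument options")
parser.add_argument("input")                                    # positional name
parser.add_argument("-v", "--verbose", action="store_true")     # flag, no value
parser.add_argument("-n", "--count", type=int, default=1)       # converted to int
parser.add_argument("-m", "--mode", choices=["fast", "slow"], default="fast")
parser.add_argument("-I", "--include", action="append", default=[])
parser.add_argument("--level", action="store_const", const=9, default=0)

args = parser.parse_args(
    ["data.txt", "-v", "-n", "3", "-I", "a", "-I", "b", "--level"]
)
print(args)
```

The attribute names on args come from the long option (or the positional name), which is the automatic dest behaviour noted above.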
9 Concurrent
9.1 threading
The module name is threading, the class is Thread.
Functions
- threading.active_count(): number of Thread objects currently alive
- threading.current_thread(): current Thread object
- threading.enumerate(): return a list of all alive Thread objects
- threading.main_thread(): the main Thread object
- threading.local(): an instance of thread-local storage. Its values are different for different threads. Typical usage: mydata = threading.local()
Two ways to specify what to run:
- pass a callable object to the target argument when constructing Thread
- define a subclass of Thread and override the run method.
Methods:
- start: start the thread. It will call the run method in a separate thread. The thread terminates when run terminates
- join(timeout=None): the calling thread will block until this thread terminates
  - timeout should be a float in seconds
- is_alive: test whether the thread is still running
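A minimal sketch showing both ways of specifying what to run, side by side. The function work and the class Squarer are hypothetical names; a Lock guards the shared list.

```python
import threading

results = []
results_lock = threading.Lock()


def work(n):
    # Append under a lock so concurrent appends are clearly serialized
    with results_lock:
        results.append(n * n)


class Squarer(threading.Thread):
    """Way 2: subclass Thread and override run()."""

    def __init__(self, n):
        super().__init__()
        self.n = n

    def run(self):
        work(self.n)


threads = [
    threading.Thread(target=work, args=(2,)),  # Way 1: pass a callable as target
    Squarer(3),
]
for t in threads:
    t.start()                 # run() executes in a separate thread
for t in threads:
    t.join(timeout=5.0)       # block until the thread terminates (or 5 s pass)

alive = [t.is_alive() for t in threads]
print(sorted(results), alive)
```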
9.2 Thread Sync
class threading.Lock
- acquire()
- release()
class threading.RLock
- this is a recursive lock. The same thread can acquire the lock multiple times. The acquisitions are nested, and only when the last release is called can the lock be acquired by another thread
- acquire()
- release()
class threading.Condition(lock=None)
- the lock must be a Lock or RLock. If None, an RLock is created
- acquire()
- release()
- wait(timeout=None): wait until notified
  - release the underlying lock
  - block until notified
  - re-acquire the lock and return
  - typical usage: while not item_is_available(): cv.wait()
  - often used with the with statement: with cv: cv.wait_for(pred); get()
- wait_for(predicate, timeout=None)
  - this is the same as while not predicate(): cv.wait(), thus more convenient than wait
- notify(n=1): wake up at most n waiting threads (one by default)
- notify_all(): notify all threads waiting on this condition
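The wait_for/notify pattern above can be sketched as a tiny producer/consumer pair. This is an illustration under simple assumptions (one consumer, a plain list as the queue); the names producer, consumer and queue are hypothetical.

```python
import threading

queue = []
cv = threading.Condition()
consumed = []


def consumer(count):
    for _ in range(count):
        with cv:
            # Releases the lock while waiting, re-acquires it before returning
            cv.wait_for(lambda: len(queue) > 0, timeout=5.0)
            consumed.append(queue.pop(0))


def producer(items):
    for item in items:
        with cv:
            queue.append(item)
            cv.notify()       # wake one waiting thread


c = threading.Thread(target=consumer, args=(3,))
c.start()
producer(["a", "b", "c"])
c.join(timeout=5.0)
print(consumed)
```

Because wait_for re-checks the predicate, the consumer behaves correctly even when the producer has already appended items before the consumer starts waiting.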
class threading.Semaphore: this class manages resources with limited capacity.
- acquire(): decrease capacity
- release(): increase capacity
class threading.Event
- is_set(): return True if the internal flag is true
- set(): set flag to true
- clear(): set flag to false
- wait(timeout=None): block until internal flag is true
class threading.Timer(interval, function) : Thread
- interval is a float in seconds, function is a callable. Use the start method to start the thread, and the function will be called after the delay.
- cancel(): stop the timer and cancel the execution. Only works if the timer is still waiting.
class threading.Barrier(parties, action=None, timeout=None)
- parties is an integer. Every thread calling wait blocks until parties threads have called it. Then all of them are released simultaneously.
- wait(timeout=None)
- reset(): reset the barrier. Any thread waiting on it will receive BrokenBarrierError
- abort(): all current and future wait calls on it will get BrokenBarrierError
- parties: number of parties
- n_waiting: number of threads currently waiting
- broken: True or False
9.2.1 Using with statement
Lock, RLock, Condition, Semaphore can be used.
  with somelock:
      pass  # do something
is equivalent to:
  somelock.acquire()
  try:
      pass  # do something
  finally:
      somelock.release()
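A small sketch of the with form guarding a shared counter. The function add and the thread/iteration counts are arbitrary choices for illustration.

```python
import threading

counter = 0
lock = threading.Lock()


def add(times):
    global counter
    for _ in range(times):
        with lock:            # acquire() on entry, release() on exit, even on error
            counter += 1


threads = [threading.Thread(target=add, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter, lock.locked())
```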
9.3 multiprocessing
This provides the multiprocessing.Process class, with an API similar to Thread. It seems to use fork, but the documentation does not mention an explicit exec. Weird, and it seems to just do what a thread can do (except for the sharing of memory, of course).
9.4 Process (subprocess module)
- subprocess.run(args, *, stdin=None, input=None, stdout=None, stderr=None, shell=False, timeout=None, check=False)
  - run the command and wait for it to complete. Return a CompletedProcess instance.
  - if check is True, raise a CalledProcessError exception if the return code is non-zero. This replaces check_call and check_output.
class subprocess.CompletedProcess
- args
- returncode
- stdout: captured if PIPE is passed to stdout
- stderr: captured if PIPE is passed to stderr
- check_returncode(): if returncode is non-zero, raise CalledProcessError
Variables:
- subprocess.DEVNULL
- subprocess.PIPE
- subprocess.STDOUT: this is only used in the place of stderr to redirect it to stdout
class subprocess.CalledProcessError
- returncode
- cmd
- output: same as stdout
- stdout
- stderr
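A minimal sketch of run, CompletedProcess and CalledProcessError. To stay portable it launches the current Python interpreter (sys.executable) as the child instead of a shell command; that choice is an assumption of this sketch.

```python
import subprocess
import sys

# Capture stdout via PIPE and fold stderr into it via STDOUT
done = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    timeout=30,
)
output = done.stdout.decode().strip()

# check=True turns a non-zero exit code into an exception
failed_code = None
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"], check=True)
except subprocess.CalledProcessError as err:
    failed_code = err.returncode

print(done.returncode, output, failed_code)
```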
The following are from 2.7; now just use run.
- subprocess.call(args, *, stdin=None, stdout=None, stderr=None, shell=False)
  - args: a list of arguments, including arg0
  - it can also be a single string, which is mainly useful with shell=True (the * only makes the remaining parameters keyword-only)
  - it will wait, then return returncode
  - do not use stdout=PIPE, use communicate() instead TODO
  - using shell=True is bad, but it can give me
    - shell pipes
    - filename wildcard
    - env variable expansion
    - ~ expansion
- check_call(args, *, …): same as call, except it will raise an exception if the return code is non-0
- check_output(args, *, stdin=None, stderr=None, shell=False, universal_newlines=False)
  - if the return code is non-0, raise an exception. Otherwise return the stdout
Popen object
- Popen constructor parameters (Python 2.7 defaults):
  - args, bufsize=0, executable=None,
  - stdin=None, stdout=None, stderr=None,
  - preexec_fn=None, close_fds=False,
  - shell=False, cwd=None, env=None,
  - universal_newlines=False, startupinfo=None, creationflags=0
- Popen.poll(): check if child process has terminated. Set and return returncode.
- Popen.wait(): wait for the process to terminate. Don't use PIPE with this: it can deadlock if the child fills the pipe buffer.
- Popen.communicate(input=None): to use this, the corresponding stdin, stdout, stderr should be set to PIPE.
  - send data to stdin (a string; bytes in Python 3 unless text mode is enabled)
  - read data from stdout and stderr (it returns a tuple (out, err))
  - wait for termination
- Popen.send_signal(signal)
- Popen.terminate(): send SIGTERM
- Popen.kill(): send SIGKILL
- Popen.pid
- Popen.returncode
  - set by poll and wait (and indirectly by communicate)
  - None indicates the process hasn't terminated
  - -N means terminated by signal N
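A short sketch of Popen with communicate, assuming Python 3 (so input and output are bytes by default). The child is again the current interpreter, running a made-up one-liner that upper-cases its stdin.

```python
import subprocess
import sys

child = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

# Sends the input, reads both pipes to EOF, then waits for termination
out, err = child.communicate(input=b"abc")

print(out, child.returncode)
```

communicate sets returncode as a side effect, so no separate wait() call is needed afterwards.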
10 Internet
10.1 urllib.request
package urllib.request
Functions
- urlopen(url, data=None)
- url can be a string or Request object
- for http and https, returns a http.client.HTTPResponse object
- for FTP, file, data urls, return a urllib.response.addinfourl object
- pathname2url(path): do quoting
- url2pathname(path): do unquoting
class Request
- constructor: (url, data=None, headers={}, method=None)
- url: a string
- headers: a dictionary.
- method: a string, e.g. 'HEAD'. If not given, it defaults to 'GET' when data is None and 'POST' otherwise
methods:
- get_method()
- add_header(key, val)
- has_header(key)
- get_header(key)
- remove_header(key)
- get_full_url()
- header_items(): return a list of tuples (key, value)
from urllib import request
import json

req = request.Request(query)
req.add_header("Authorization", "token " + token)
response = request.urlopen(req)
s = response.read().decode('utf8')
langj = json.loads(s)

# deprecated (legacy interface): urllib.request.urlretrieve(url[, filename])
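A sketch of the Request methods that needs no network access: it only builds Request objects and inspects them, never calling urlopen. The URL and token are placeholders, not real endpoints or credentials.

```python
from urllib import request

url = "https://example.com/api"   # placeholder URL, never contacted

req = request.Request(url, headers={"Accept": "application/json"})
req.add_header("Authorization", "token " + "hypothetical-token")

method_without_data = req.get_method()                             # no data given
method_with_data = request.Request(url, data=b"x=1").get_method()  # data given
explicit_method = request.Request(url, method="HEAD").get_method()

auth = req.get_header("Authorization")
header_names = sorted(key for key, _ in req.header_items())

req.remove_header("Accept")
still_has_accept = req.has_header("Accept")

print(method_without_data, method_with_data, explicit_method, header_names)
```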
10.2 urllib.parse
- quote(string)
- quote_plus(string)
- unquote(string)
- unquote_plus(string)
- urlencode(query)
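A quick sketch of how the quoting functions differ: quote leaves '/' alone and encodes a space as %20, while quote_plus encodes '/' and turns a space into '+'. The sample strings are made up.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus, urlencode

quoted = quote("a b/c")              # '/' is kept by default
quoted_plus = quote_plus("a b/c")    # space becomes '+', '/' is encoded
plain = unquote("a%20b")
plain_plus = unquote_plus("a+b")
query = urlencode({"q": "a b", "n": 1})   # builds a query string using quote_plus

print(quoted, quoted_plus, plain, plain_plus, query)
```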
11 Data
11.1 Json
import json
json.dumps({"C": 0, "D": 1})
json.loads("a string of json")
json.dump(obj, fp, indent=2)
json.load(fp)
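A round-trip sketch of the four calls above. An io.StringIO stands in for the file object fp, which is an assumption made so the example leaves no file behind.

```python
import io
import json

# String-based pair: dumps / loads
text = json.dumps({"C": 0, "D": 1})
obj = json.loads(text)

# File-based pair: dump / load, here against an in-memory text stream
fp = io.StringIO()
json.dump(obj, fp, indent=2)
pretty = fp.getvalue()
fp.seek(0)
again = json.load(fp)

print(text)
print(pretty)
```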